Dataset statistics
| Number of variables | 36 |
|---|---|
| Number of observations | 179 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 547.4 KiB |
| Average record size in memory | 3.1 KiB |
Variable types
| Numeric | 14 |
|---|---|
| Categorical | 19 |
| Unsupported | 3 |
lang has constant value "en" | Constant |
LAW has constant value "0" | Constant |
MONEY has constant value "0" | Constant |
Name has a high cardinality: 179 distinct values | High cardinality |
Description has a high cardinality: 179 distinct values | High cardinality |
Description_clean has a high cardinality: 179 distinct values | High cardinality |
word_count is highly correlated with char_count and 1 other fields | High correlation |
char_count is highly correlated with word_count and 1 other fields | High correlation |
sentence_count is highly correlated with word_count and 1 other fields | High correlation |
name_word_count is highly correlated with name_char_count | High correlation |
name_char_count is highly correlated with name_word_count | High correlation |
sentence_count is highly correlated with word_count and 2 other fields | High correlation |
CARDINAL is highly correlated with sentence_count | High correlation |
DATE is highly correlated with ORG | High correlation |
ORG is highly correlated with DATE | High correlation |
ORDINAL is highly correlated with LAW and 2 other fields | High correlation |
PERCENT is highly correlated with LAW and 3 other fields | High correlation |
LAW is highly correlated with ORDINAL and 14 other fields | High correlation |
LOC is highly correlated with LAW and 2 other fields | High correlation |
Type is highly correlated with LAW and 2 other fields | High correlation |
lang is highly correlated with ORDINAL and 14 other fields | High correlation |
PRODUCT is highly correlated with LAW and 2 other fields | High correlation |
FAC is highly correlated with LAW and 2 other fields | High correlation |
QUANTITY is highly correlated with PERCENT and 3 other fields | High correlation |
WORK_OF_ART is highly correlated with LAW and 2 other fields | High correlation |
MONEY is highly correlated with ORDINAL and 14 other fields | High correlation |
NORP is highly correlated with LAW and 2 other fields | High correlation |
TIME is highly correlated with LAW and 2 other fields | High correlation |
name_word_count is highly correlated with LAW and 2 other fields | High correlation |
LANGUAGE is highly correlated with LAW and 2 other fields | High correlation |
EVENT is highly correlated with LAW and 2 other fields | High correlation |
df_index is highly correlated with Type | High correlation |
Type is highly correlated with df_index and 1 other fields | High correlation |
word_count is highly correlated with char_count and 7 other fields | High correlation |
char_count is highly correlated with word_count and 7 other fields | High correlation |
sentence_count is highly correlated with word_count and 5 other fields | High correlation |
avg_sentence_length is highly correlated with word_count and 2 other fields | High correlation |
name_word_count is highly correlated with name_char_count and 1 other fields | High correlation |
name_char_count is highly correlated with name_word_count and 1 other fields | High correlation |
name_avg_word_length is highly correlated with Type and 2 other fields | High correlation |
Polarity is highly correlated with avg_sentence_length | High correlation |
CARDINAL is highly correlated with word_count and 5 other fields | High correlation |
DATE is highly correlated with word_count and 10 other fields | High correlation |
EVENT is highly correlated with PERSON | High correlation |
FAC is highly correlated with DATE | High correlation |
GPE is highly correlated with word_count and 2 other fields | High correlation |
LOC is highly correlated with DATE | High correlation |
NORP is highly correlated with PERSON | High correlation |
ORDINAL is highly correlated with CARDINAL and 1 other fields | High correlation |
ORG is highly correlated with sentence_count and 3 other fields | High correlation |
PERCENT is highly correlated with DATE and 1 other fields | High correlation |
PERSON is highly correlated with word_count and 4 other fields | High correlation |
QUANTITY is highly correlated with word_count and 3 other fields | High correlation |
WORK_OF_ART is highly correlated with DATE and 2 other fields | High correlation |
Name is uniformly distributed | Uniform |
Description is uniformly distributed | Uniform |
Description_clean is uniformly distributed | Uniform |
Name has unique values | Unique |
Description has unique values | Unique |
Description_clean has unique values | Unique |
parsed is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
entity_tags is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
entity_types is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Polarity has 17 (9.5%) zeros | Zeros |
CARDINAL has 92 (51.4%) zeros | Zeros |
DATE has 121 (67.6%) zeros | Zeros |
GPE has 134 (74.9%) zeros | Zeros |
ORG has 88 (49.2%) zeros | Zeros |
PERSON has 76 (42.5%) zeros | Zeros |
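The Constant, Unique, and Zeros alerts above can be approximated with a few counts per column. The following is a minimal sketch, not the profiler's own logic: the helper name `column_alerts` is hypothetical, and the rule that any zero triggers a Zeros alert is an assumption (the real tool applies configurable thresholds).

```python
from collections import Counter


def column_alerts(values):
    """Return simplified, profiling-style alert labels for one column.

    Hypothetical helper: the thresholds are assumptions, not the
    profiler's configuration.
    """
    n = len(values)
    counts = Counter(values)
    alerts = []
    # A column with a single distinct value is constant.
    if len(counts) == 1:
        alerts.append("Constant")
    # A column where every value is distinct is unique.
    if n > 1 and len(counts) == n:
        alerts.append("Unique")
    # Count numeric zeros and report them as a share of all rows.
    zeros = sum(1 for v in values if v == 0)
    if zeros:
        alerts.append(f"Zeros: {zeros} ({100 * zeros / n:.1f}%)")
    return alerts


# 17 zeros in 179 rows, as reported for Polarity.
print(column_alerts([0] * 17 + [0.5] * 162))
```

The percentage is simply the zero count divided by the 179 observations, which is how figures such as 17 (9.5%) arise.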
Reproduction
| Analysis started | 2022-05-09 09:04:43.434693 |
|---|---|
| Analysis finished | 2022-05-09 09:05:11.467447 |
| Duration | 28.03 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
| Distinct | 126 |
|---|---|
| Distinct (%) | 70.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 51.98882682 |
| Minimum | 0 |
|---|---|
| Maximum | 125 |
| Zeros | 1 |
| Zeros (%) | 0.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4.9 |
| Q1 | 22.5 |
| median | 45 |
| Q3 | 80.5 |
| 95-th percentile | 116.1 |
| Maximum | 125 |
| Range | 125 |
| Interquartile range (IQR) | 58 |
Descriptive statistics
| Standard deviation | 35.64068323 |
|---|---|
| Coefficient of variation (CV) | 0.6855450568 |
| Kurtosis | -0.9579792152 |
| Mean | 51.98882682 |
| Median Absolute Deviation (MAD) | 27 |
| Skewness | 0.4575273268 |
| Sum | 9306 |
| Variance | 1270.258301 |
| Monotonicity | Not monotonic |
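Several of the figures above are derived from one another, so they can be cross-checked with plain arithmetic. The sketch below recomputes the mean, coefficient of variation, variance, interquartile range, and range from the reported sum, count, standard deviation, quartiles, and extremes. The variable names are illustrative only.

```python
# Values copied from the report tables above.
n = 179                  # number of observations
total = 9306             # Sum
std = 35.64068323        # Standard deviation
q1, q3 = 22.5, 80.5      # Q1 and Q3
minimum, maximum = 0, 125

mean = total / n               # Mean = Sum / n
cv = std / mean                # Coefficient of variation = std / mean
variance = std ** 2            # Variance = std squared
iqr = q3 - q1                  # Interquartile range = Q3 - Q1
value_range = maximum - minimum

print(mean, cv, variance, iqr, value_range)
```

Each recomputed value agrees with the corresponding row of the report to the precision shown there.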
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 44 | 2 | 1.1% |
| 41 | 2 | 1.1% |
| 29 | 2 | 1.1% |
| 30 | 2 | 1.1% |
| 31 | 2 | 1.1% |
| 32 | 2 | 1.1% |
| 33 | 2 | 1.1% |
| 34 | 2 | 1.1% |
| 36 | 2 | 1.1% |
| 37 | 2 | 1.1% |
| Other values (116) | 159 | 88.8% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | 0.6% |
| 1 | 2 | 1.1% |
| 2 | 2 | 1.1% |
| 3 | 2 | 1.1% |
| 4 | 2 | 1.1% |
| 5 | 2 | 1.1% |
| 6 | 2 | 1.1% |
| 7 | 2 | 1.1% |
| 8 | 2 | 1.1% |
| 9 | 2 | 1.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 125 | 1 | 0.6% |
| 124 | 1 | 0.6% |
| 123 | 1 | 0.6% |
| 122 | 1 | 0.6% |
| 121 | 1 | 0.6% |
| 120 | 1 | 0.6% |
| 119 | 1 | 0.6% |
| 118 | 1 | 0.6% |
| 117 | 1 | 0.6% |
| 116 | 1 | 0.6% |
| Distinct | 179 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.4 KiB |
| Acinetobacter baumannii | 1 |
|---|---|
| Bacteriophage φCb5 | 1 |
| Streptococcus sobrinus | 1 |
| Treponema | 1 |
| Ureaplasma urealyticum | 1 |
| Other values (174) | 174 |
Length
| Max length | 32 |
|---|---|
| Median length | 25 |
| Mean length | 17.91061453 |
| Min length | 6 |
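The length statistics above are summaries of the per-row string lengths. As a rough sketch, assuming length is measured with Python's `len` (code points, not bytes), the same four figures can be computed for any list of strings. The helper name `length_stats` is hypothetical, and the five names used here are only the sample rows shown in this report, so the results differ from the full-column figures.

```python
from statistics import mean, median


def length_stats(strings):
    """Summarise string lengths the way the Length table does."""
    lengths = [len(s) for s in strings]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# Sample rows only; the full column has 179 values.
sample_names = [
    "Acinetobacter baumannii",
    "Actinomyces israelii",
    "Agrobacterium tumefaciens",
    "Anaplasma",
    "Anaplasma phagocytophilum",
]
stats = length_stats(sample_names)
print(stats)

# Cross-check: mean length = total characters / number of rows,
# using the totals reported for this variable (3206 and 179).
mean_from_totals = 3206 / 179
print(mean_from_totals)
```

The cross-check at the end reproduces the reported mean length of 17.91061453 from the total character count.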
Characters and Unicode
| Total characters | 3206 |
|---|---|
| Distinct characters | 65 |
| Distinct categories | 7 |
| Distinct scripts | 3 |
| Distinct blocks | 3 |
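The category counts in this section correspond to Unicode general categories. A minimal sketch using the standard-library `unicodedata` module is shown below. It returns two-letter category codes (for example `Ll` for Lowercase Letter and `Zs` for Space Separator) and does not cover scripts or blocks, which the standard library does not expose. The helper name `category_counts` and the two-name input are illustrative only.

```python
import unicodedata
from collections import Counter


def category_counts(strings):
    """Count Unicode general categories over all characters."""
    return Counter(
        unicodedata.category(ch) for s in strings for ch in s
    )


# Two values from the Name column; \u03c6 is the Greek small letter phi.
names = ["Bacteriophage \u03c6Cb5", "Treponema"]
counts = category_counts(names)

# Map the codes that occur here to the labels used in the report.
labels = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
}
for code, count in counts.most_common():
    print(labels.get(code, code), count)
```

Note that the Greek phi counts as a lowercase letter, which is why the Greek script contributes to the Lowercase Letter total.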
Unique
| Unique | 179 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | Acinetobacter baumannii |
|---|---|
| 2nd row | Actinomyces israelii |
| 3rd row | Agrobacterium tumefaciens |
| 4th row | Anaplasma |
| 5th row | Anaplasma phagocytophilum |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Acinetobacter baumannii | 1 | 0.6% |
| Bacteriophage φCb5 | 1 | 0.6% |
| Streptococcus sobrinus | 1 | 0.6% |
| Treponema | 1 | 0.6% |
| Ureaplasma urealyticum | 1 | 0.6% |
| Vibrio | 1 | 0.6% |
| Vibrio cholerae | 1 | 0.6% |
| Vibrio parahaemolyticus | 1 | 0.6% |
| Vibrio vulnificus | 1 | 0.6% |
| Wolbachia | 1 | 0.6% |
| Other values (169) | 169 | 94.4% |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| streptococcus | 12 | 3.7% |
| bacteriophage | 10 | 3.0% |
| bacillus | 9 | 2.7% |
| mycobacterium | 7 | 2.1% |
| virus | 7 | 2.1% |
| enterococcus | 6 | 1.8% |
| mycoplasma | 6 | 1.8% |
| phage | 5 | 1.5% |
| campylobacter | 4 | 1.2% |
| haemophilus | 4 | 1.2% |
| Other values (216) | 258 | 78.7% |
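The word-frequency table above can be approximated by lowercasing each value, splitting on whitespace, and counting tokens. This is only a sketch: the exact tokenisation used by the profiler is an assumption, the helper name `word_frequencies` is hypothetical, and the five names below are taken from the values listed in this report, not the full column.

```python
from collections import Counter


def word_frequencies(strings):
    """Return (word, count, percent) tuples, most frequent first."""
    words = [w.lower() for s in strings for w in s.split()]
    total = len(words)
    counts = Counter(words)
    return [
        (word, count, round(100 * count / total, 1))
        for word, count in counts.most_common()
    ]


names = [
    "Vibrio",
    "Vibrio cholerae",
    "Vibrio parahaemolyticus",
    "Vibrio vulnificus",
    "Wolbachia",
]
for word, count, percent in word_frequencies(names):
    print(f"| {word} | {count} | {percent}% |")
```

The percentage is relative to the total number of words, not the number of rows, which is why the most frequent word in the table (12 occurrences) is reported as 3.7% and not 6.7%.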
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| e | 280 | 8.7% |
| a | 275 | 8.6% |
| i | 266 | 8.3% |
| o | 226 | 7.0% |
| c | 224 | 7.0% |
| r | 201 | 6.3% |
| s | 198 | 6.2% |
| t | 174 | 5.4% |
| l | 165 | 5.1% |
| u | 149 | 4.6% |
| Other values (55) | 1048 | 32.7% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Lowercase Letter | 2808 | 87.6% |
| Uppercase Letter | 214 | 6.7% |
| Space Separator | 149 | 4.6% |
| Decimal Number | 27 | 0.8% |
| Close Punctuation | 3 | 0.1% |
| Open Punctuation | 3 | 0.1% |
| Dash Punctuation | 2 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| e | 280 | 10.0% |
| a | 275 | 9.8% |
| i | 266 | 9.5% |
| o | 226 | 8.0% |
| c | 224 | 8.0% |
| r | 201 | 7.2% |
| s | 198 | 7.1% |
| t | 174 | 6.2% |
| l | 165 | 5.9% |
| u | 149 | 5.3% |
| Other values (17) | 650 | 23.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| B | 35 | 16.4% |
| S | 23 | 10.7% |
| C | 23 | 10.7% |
| P | 21 | 9.8% |
| M | 20 | 9.3% |
| E | 13 | 6.1% |
| L | 12 | 5.6% |
| A | 12 | 5.6% |
| T | 9 | 4.2% |
| V | 7 | 3.3% |
| Other values (14) | 39 | 18.2% |
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 7 | 25.9% |
| 1 | 6 | 22.2% |
| 2 | 5 | 18.5% |
| 5 | 4 | 14.8% |
| 7 | 1 | 3.7% |
| 9 | 1 | 3.7% |
| 4 | 1 | 3.7% |
| 3 | 1 | 3.7% |
| 8 | 1 | 3.7% |
Dash Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| - | 1 | 50.0% |
| – | 1 | 50.0% |
Space Separator
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 149 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| ) | 3 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| ( | 3 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 3019 | 94.2% |
| Common | 184 | 5.7% |
| Greek | 3 | 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
|---|---|---|
| e | 280 | 9.3% |
| a | 275 | 9.1% |
| i | 266 | 8.8% |
| o | 226 | 7.5% |
| c | 224 | 7.4% |
| r | 201 | 6.7% |
| s | 198 | 6.6% |
| t | 174 | 5.8% |
| l | 165 | 5.5% |
| u | 149 | 4.9% |
| Other values (39) | 861 | 28.5% |
Common
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 149 | 81.0% |
| 0 | 7 | 3.8% |
| 1 | 6 | 3.3% |
| 2 | 5 | 2.7% |
| 5 | 4 | 2.2% |
| ) | 3 | 1.6% |
| ( | 3 | 1.6% |
| - | 1 | 0.5% |
| 7 | 1 | 0.5% |
| 9 | 1 | 0.5% |
| Other values (4) | 4 | 2.2% |
Greek
| Value | Count | Frequency (%) |
|---|---|---|
| φ | 2 | 66.7% |
| Φ | 1 | 33.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 3202 | 99.9% |
| None | 3 | 0.1% |
| Punctuation | 1 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| e | 280 | 8.7% |
| a | 275 | 8.6% |
| i | 266 | 8.3% |
| o | 226 | 7.1% |
| c | 224 | 7.0% |
| r | 201 | 6.3% |
| s | 198 | 6.2% |
| t | 174 | 5.4% |
| l | 165 | 5.2% |
| u | 149 | 4.7% |
| Other values (52) | 1044 | 32.6% |
None
| Value | Count | Frequency (%) |
|---|---|---|
| φ | 2 | 66.7% |
| Φ | 1 | 33.3% |
Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| – | 1 | 100.0% |
| Distinct | 179 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 227.0 KiB |
| Acinetobacter baumannii is a typically short, almost round, rod-shaped (coccobacillus) Gram-negative bacterium. It is named after the bacteriologist Paul Baumann. It can be an opportunistic pathogen in humans, affecting people with compromised immune systems, and is becoming increasingly important as a hospital-derived (nosocomial) infection. While other species of the genus Acinetobacter are often found in soil samples (leading to the common misconception that A. baumannii is a soil organism, too), it is almost exclusively isolated from hospital environments. Although occasionally it has been found in environmental soil and water samples, its natural habitat is still not known. Bacteria of this genus lack flagella, whip-like structures many bacteria use for locomotion, but exhibit twitching or swarming motility. This may be due to the activity of type IV pili, pole-like structures that can be extended and retracted. Motility in A. baumannii may also be due to the excretion of exopolysaccharide, creating a film of high-molecular-weight sugar chains behind the bacterium to move forward. Clinical microbiologists typically differentiate members of the genus Acinetobacter from other Moraxellaceae by performing an oxidase test, as Acinetobacter spp. are the only members of the Moraxellaceae to lack cytochrome c oxidases.A. baumannii is part of the ACB complex (A. baumannii, A. calcoaceticus, and Acinetobacter genomic species 13TU). It is difficult to determine the specific species of members of the ACB complex and they comprise the most clinically relevant members of the genus. A. baumannii has also been identified as an ESKAPE pathogen (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species), a group of pathogens with a high rate of antibiotic resistance that are responsible for the majority of nosocomial infections.Colloquially, A. baumannii is referred to as "Iraqibacter" due to its seemingly sudden emergence in military treatment facilities during the Iraq War. It has continued to be an issue for veterans and soldiers who served in Iraq and Afghanistan. Multidrug-resistant A. baumannii has spread to civilian hospitals in part due to the transport of infected soldiers through multiple medical facilities. During the COVID-19 pandemic, coinfection with A. baumannii secondary to SARS-CoV-2 infections has been reported multiple times in literature. | 1 |
|---|---|
| Bacteriophage φCb5 is a bacteriophage that infects Caulobacter bacteria and other caulobacteria. The bacteriophage was discovered in 1970, it belongs to the genus Cebevirus of the Steitzviridae family and is the type species of the family. The bacteriophage is widely distributed in the soil, freshwater lakes, streams and seawater, places where caulobacteria inhabit and can be sensitive to salinity. | 1 |
| Streptococcus sobrinus is a Gram-positive, catalase-negative, non-motile, and anaerobic member of the genus Streptococcus. | 1 |
| Treponema is a genus of spiral-shaped bacteria. The major treponeme species of human pathogens is Treponema pallidum, whose subspecies are responsible for diseases such as syphilis, bejel, and yaws. Treponema carateum is the cause of pinta. Treponema paraluiscuniculi is associated with syphilis in rabbits. Treponema succinifaciens has been found in the gut microbiome of traditional rural human populations. | 1 |
| Ureaplasma urealyticum is a bacterium belonging to the genus Ureaplasma and the family Mycoplasmataceae in the order Mycoplasmatales. This family consists of the genera Mycoplasma and Ureaplasma. Its type strain is T960. There are two known biovars of this species; T960 and 27. These strains of bacterium are commonly found in the urogenital tracts of human beings, but overgrowth can lead to infections that cause the patient discomfort. Unlike most bacteria, Ureaplasma urealyticum lacks a cell wall making it unique in physiology and medical treatment. | 1 |
| Other values (174) | 174 |
Length
| Max length | 3608 |
|---|---|
| Median length | 742 |
| Mean length | 876.9497207 |
| Min length | 72 |
Characters and Unicode
| Total characters | 156974 |
|---|---|
| Distinct characters | 122 |
| Distinct categories | 13 |
| Distinct scripts | 3 |
| Distinct blocks | 4 |
Unique
| Unique | 179 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | Acinetobacter baumannii is a typically short, almost round, rod-shaped (coccobacillus) Gram-negative bacterium. It is named after the bacteriologist Paul Baumann. It can be an opportunistic pathogen in humans, affecting people with compromised immune systems, and is becoming increasingly important as a hospital-derived (nosocomial) infection. While other species of the genus Acinetobacter are often found in soil samples (leading to the common misconception that A. baumannii is a soil organism, too), it is almost exclusively isolated from hospital environments. Although occasionally it has been found in environmental soil and water samples, its natural habitat is still not known. Bacteria of this genus lack flagella, whip-like structures many bacteria use for locomotion, but exhibit twitching or swarming motility. This may be due to the activity of type IV pili, pole-like structures that can be extended and retracted. Motility in A. baumannii may also be due to the excretion of exopolysaccharide, creating a film of high-molecular-weight sugar chains behind the bacterium to move forward. Clinical microbiologists typically differentiate members of the genus Acinetobacter from other Moraxellaceae by performing an oxidase test, as Acinetobacter spp. are the only members of the Moraxellaceae to lack cytochrome c oxidases.A. baumannii is part of the ACB complex (A. baumannii, A. calcoaceticus, and Acinetobacter genomic species 13TU). It is difficult to determine the specific species of members of the ACB complex and they comprise the most clinically relevant members of the genus. A. baumannii has also been identified as an ESKAPE pathogen (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species), a group of pathogens with a high rate of antibiotic resistance that are responsible for the majority of nosocomial infections.Colloquially, A. baumannii is referred to as "Iraqibacter" due to its seemingly sudden emergence in military treatment facilities during the Iraq War. It has continued to be an issue for veterans and soldiers who served in Iraq and Afghanistan. Multidrug-resistant A. baumannii has spread to civilian hospitals in part due to the transport of infected soldiers through multiple medical facilities. During the COVID-19 pandemic, coinfection with A. baumannii secondary to SARS-CoV-2 infections has been reported multiple times in literature. |
|---|---|
| 2nd row | Actinomyces israelii is a species of Gram-positive, rod-shaped bacteria within the genus Actinomyces. Known to live commensally on and within humans, A. israelii is an opportunistic pathogen and a cause of actinomycosis. Many physiologically diverse strains of the species are known to exist, though not all are strict anaerobes. It was named after the German surgeon James Adolf Israel (1848–1926), who studied the organism for the first time in 1878. |
| 3rd row | Agrobacterium radiobacter (more commonly known as Agrobacterium tumefaciens) is the causal agent of crown gall disease (the formation of tumours) in over 140 species of eudicots. It is a rod-shaped, Gram-negative soil bacterium. Symptoms are caused by the insertion of a small segment of DNA (known as the T-DNA, for 'transfer DNA', not to be confused with tRNA that transfers amino acids during protein synthesis), from a plasmid into the plant cell, which is incorporated at a semi-random location into the plant genome. Plant genomes can be engineered by use of Agrobacterium for the delivery of sequences hosted in T-DNA binary vectors. Agrobacterium tumefaciens is an alphaproteobacterium of the family Rhizobiaceae, which includes the nitrogen-fixing legume symbionts. Unlike the nitrogen-fixing symbionts, tumor-producing Agrobacterium species are pathogenic and do not benefit the plant. The wide variety of plants affected by Agrobacterium makes it of great concern to the agriculture industry.Economically, A. tumefaciens is a serious pathogen of walnuts, grape vines, stone fruits, nut trees, sugar beets, horse radish, and rhubarb, and the persistent nature of the tumors or galls caused by the disease make it particularly harmful for perennial crops.Agrobacterium tumefaciens grows optimally at 28 °C. The doubling time can range from 2.5–4h depending on the media, culture format, and level of aeration. At temperatures above 30 °C, A. tumefaciens begins to experience heat shock which is likely to result in errors in cell division. |
| 4th row | Anaplasma is a genus of bacteria of the alphaproteobacterial order Rickettsiales, family Anaplasmataceae. Anaplasma species reside in host blood cells and lead to the disease anaplasmosis. The disease most commonly occurs in areas where competent tick vectors are indigenous, including tropical and semitropical areas of the world for intraerythrocytic Anaplasma spp.Anaplasma species are biologically transmitted by Ixodes deer-tick vectors, and the prototypical species, A. marginale, can be mechanically transmitted by biting flies and iatrogenically with blood-contaminated instruments. One of the major consequences of infection by bovine red blood cells by A. marginale is the development of nonhaemolytic anaemia, thus the absence of hemoglobinuria, which allows clinical differentiation from another major tick-borne disease, bovine babesiosis, caused by Babesia bigemina.Species of veterinary interest include: Anaplasma marginale and Anaplasma centrale in cattle Anaplasma ovis and Anaplasma mesaeterum in sheep and goats Anaplasma phagocytophilum in dogs, cats, and horses (see human granulocytic anaplasmosis) Anaplasma platys in dogs |
| 5th row | Anaplasma phagocytophilum (formerly Ehrlichia phagocytophilum) is a Gram-negative bacterium that is unusual in its tropism to neutrophils. It causes anaplasmosis in sheep and cattle, also known as tick-borne fever and pasture fever, and also causes the zoonotic disease human granulocytic anaplasmosis.A. phagocytophilum is a Gram-negative, obligate bacterium of neutrophils. It causes human granulocytic anaplasmosis, which is a tick-borne rickettsial disease. Because this bacterium invades neutrophils, it has a unique adaptation and pathogenetic mechanism. |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Acinetobacter baumannii is a typically short, almost round, rod-shaped (coccobacillus) Gram-negative bacterium. It is named after the bacteriologist Paul Baumann. It can be an opportunistic pathogen in humans, affecting people with compromised immune systems, and is becoming increasingly important as a hospital-derived (nosocomial) infection. While other species of the genus Acinetobacter are often found in soil samples (leading to the common misconception that A. baumannii is a soil organism, too), it is almost exclusively isolated from hospital environments. Although occasionally it has been found in environmental soil and water samples, its natural habitat is still not known. Bacteria of this genus lack flagella, whip-like structures many bacteria use for locomotion, but exhibit twitching or swarming motility. This may be due to the activity of type IV pili, pole-like structures that can be extended and retracted. Motility in A. baumannii may also be due to the excretion of exopolysaccharide, creating a film of high-molecular-weight sugar chains behind the bacterium to move forward. Clinical microbiologists typically differentiate members of the genus Acinetobacter from other Moraxellaceae by performing an oxidase test, as Acinetobacter spp. are the only members of the Moraxellaceae to lack cytochrome c oxidases.A. baumannii is part of the ACB complex (A. baumannii, A. calcoaceticus, and Acinetobacter genomic species 13TU). It is difficult to determine the specific species of members of the ACB complex and they comprise the most clinically relevant members of the genus. A. baumannii has also been identified as an ESKAPE pathogen (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species), a group of pathogens with a high rate of antibiotic resistance that are responsible for the majority of nosocomial infections.Colloquially, A. baumannii is referred to as "Iraqibacter" due to its seemingly sudden emergence in military treatment facilities during the Iraq War. It has continued to be an issue for veterans and soldiers who served in Iraq and Afghanistan. Multidrug-resistant A. baumannii has spread to civilian hospitals in part due to the transport of infected soldiers through multiple medical facilities. During the COVID-19 pandemic, coinfection with A. baumannii secondary to SARS-CoV-2 infections has been reported multiple times in literature. | 1 | 0.6% |
| Bacteriophage φCb5 is a bacteriophage that infects Caulobacter bacteria and other caulobacteria. The bacteriophage was discovered in 1970, it belongs to the genus Cebevirus of the Steitzviridae family and is the type species of the family. The bacteriophage is widely distributed in the soil, freshwater lakes, streams and seawater, places where caulobacteria inhabit and can be sensitive to salinity. | 1 | 0.6% |
| Streptococcus sobrinus is a Gram-positive, catalase-negative, non-motile, and anaerobic member of the genus Streptococcus. | 1 | 0.6% |
| Treponema is a genus of spiral-shaped bacteria. The major treponeme species of human pathogens is Treponema pallidum, whose subspecies are responsible for diseases such as syphilis, bejel, and yaws. Treponema carateum is the cause of pinta. Treponema paraluiscuniculi is associated with syphilis in rabbits. Treponema succinifaciens has been found in the gut microbiome of traditional rural human populations. | 1 | 0.6% |
| Ureaplasma urealyticum is a bacterium belonging to the genus Ureaplasma and the family Mycoplasmataceae in the order Mycoplasmatales. This family consists of the genera Mycoplasma and Ureaplasma. Its type strain is T960. There are two known biovars of this species; T960 and 27. These strains of bacterium are commonly found in the urogenital tracts of human beings, but overgrowth can lead to infections that cause the patient discomfort. Unlike most bacteria, Ureaplasma urealyticum lacks a cell wall making it unique in physiology and medical treatment. | 1 | 0.6% |
| Vibrio is a genus of Gram-negative bacteria, possessing a curved-rod (comma) shape, several species of which can cause foodborne infection, usually associated with eating undercooked seafood. Typically found in salt water, Vibrio species are facultative anaerobes that test positive for oxidase and do not form spores. All members of the genus are motile. They are able to have polar or lateral flagellum with or without sheaths. Vibrio species typically possess two chromosomes, which is unusual for bacteria. Each chromosome has a distinct and independent origin of replication, and are conserved together over time in the genus. Recent phylogenies have been constructed based on a suite of genes (multilocus sequence analysis).O. F. Müller (1773, 1786) described eight species of the genus Vibrio (included in Infusoria), three of which were spirilliforms. Some of the other species are today assigned to eukaryote taxa, e.g., to the euglenoid Peranema or to the diatom Bacillaria. However, Vibrio Müller, 1773 became regarded as the name of a zoological genus, and the name of the bacterial genus became Vibrio Pacini, 1854. Filippo Pacini isolated micro-organisms he called "vibrions" from cholera patients in 1854, because of their motility. In Latin "vibrio" means "to quiver".Vibrio spp. are commonly found in marine environments. Marine Vibrio species are highly salt tolerant and can grow in wide range of salinity. S.I. Paul et al. (2021) isolated, characterized, and identified multiple strains of Vibrio species (Vibrio alginolyticus, Vibrio natriegens, Vibrio pelagius, Vibrio azureus) from marine sponges of the Saint Martin's Island Area of the Bay of Bengal, Bangladesh. Where, Vibrio species were found most dominant bacteria in marine environment. | 1 | 0.6% |
| Vibrio cholerae is a species of Gram-negative, facultative anaerobe and comma-shaped bacteria. The bacteria naturally live in brackish or saltwater where they attach themselves easily to the chitin-containing shells of crabs, shrimps, and other shellfish. Some strains of V. cholerae are pathogenic to humans and cause a deadly disease cholera, which can be derived from the consumption of undercooked or raw marine life species.V. cholerae was first described by Félix-Archimède Pouchet in 1849 as some kind of protozoa. Filippo Pacini correctly identified it as a bacterium and from him, the scientific name is adopted. The bacterium as the cause of cholera was discovered by Robert Koch in 1884. Sambhu Nath De isolated the cholera toxin and demonstrated the toxin as the cause of cholera in 1959. The bacterium has a flagellum at one pole and several pili throughout its cell surface. It undergoes respiratory and fermentative metabolism. Two serogroups called O1 and O139 are responsible for cholera outbreaks. Infection is mainly through drinking contaminated water, therefore is linked to sanitation and hygiene. When ingested, it invades the intestinal mucosa can cause diarrhea and vomiting in a host within several hours to 2–3 days of ingestion. Oral rehydration solution and antibiotics such as fluoroquinolones and tetracyclines are the common treatment methods. V. cholerae has two circular DNA. One DNA produces the cholera toxin (CT), a protein that causes profuse, watery diarrhea (known as "rice-water stool"). But the DNA does not directly code for the toxin as the genes for cholera toxin are carried by CTXphi (CTXφ), a temperate bacteriophage (virus). The virus when inserted into the bacterial DNA only produce the toxin. | 1 | 0.6% |
| Vibrio parahaemolyticus is a curved, rod-shaped, Gram-negative bacterium found in the sea and in estuaries which, when ingested, causes gastrointestinal illness in humans. V. parahaemolyticus is oxidase positive, facultatively aerobic, and does not form spores. Like other members of the genus Vibrio, this species is motile, with a single, polar flagellum. | 1 | 0.6% |
| Vibrio vulnificus is a species of Gram-negative, motile, curved rod-shaped (bacillus), pathogenic bacteria of the genus Vibrio. Present in marine environments such as estuaries, brackish ponds, or coastal areas, V. vulnificus is related to V. cholerae, the causative agent of cholera. At least one strain of V. vulnificus is bioluminescent.Infection with V. vulnificus leads to rapidly expanding cellulitis or sepsis.: 279 It was first isolated as a source of disease in 1976. | 1 | 0.6% |
| Wolbachia is a genus of intracellular bacteria that infects mainly arthropod species, including a high proportion of insects, and also some nematodes. It is one of the most common parasitic microbes and is possibly the most common reproductive parasite in the biosphere. Its interactions with its hosts are often complex, and in some cases have evolved to be mutualistic rather than parasitic. Some host species cannot reproduce, or even survive, without Wolbachia colonisation. One study concluded that more than 16% of neotropical insect species carry bacteria of this genus, and as many as 25 to 70% of all insect species are estimated to be potential hosts. | 1 | 0.6% |
| Other values (169) | 169 | 94.4% |
Length (histogram not preserved)
Words
| Value | Count | Frequency (%) |
| the | 1227 | 5.3% |
| of | 900 | 3.9% |
| and | 712 | 3.1% |
| in | 626 | 2.7% |
| is | 624 | 2.7% |
| a | 573 | 2.5% |
| to | 438 | 1.9% |
| are | 249 | 1.1% |
| as | 241 | 1.0% |
| it | 227 | 1.0% |
| Other values (4437) | 17401 |
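
The Frequency (%) column in the word table above is each count divided by the total number of word occurrences. A minimal sketch that recomputes it from the counts shown, assuming the ten listed words plus the "Other values" row account for every occurrence (variable names are illustrative, not taken from the profiler):

```python
# Counts copied from the word table above.
word_counts = {
    "the": 1227, "of": 900, "and": 712, "in": 626, "is": 624,
    "a": 573, "to": 438, "are": 249, "as": 241, "it": 227,
}
# The "Other values (4437)" row groups all remaining words.
other_values = 17401

# Assumption: listed words plus "Other values" cover every word occurrence.
total = sum(word_counts.values()) + other_values

# Frequency (%) = count / total, rounded to one decimal as in the table.
frequencies = {w: round(100 * c / total, 1) for w, c in word_counts.items()}
other_share = round(100 * other_values / total, 1)

print(total)
print(frequencies)
print(other_share)
```

Under that assumption the total is 23218 occurrences, the recomputed shares agree with the printed column, and the "Other values" row accounts for about 74.9% of occurrences.
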
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 23029 | |
| e | 14460 | 9.2% |
| a | 11244 | 7.2% |
| i | 11100 | 7.1% |
| t | 10073 | 6.4% |
| s | 9329 | 5.9% |
| o | 9195 | 5.9% |
| n | 8688 | 5.5% |
| r | 7554 | 4.8% |
| c | 6262 | 4.0% |
| Other values (112) | 46040 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 124794 | |
| Space Separator | 23036 | 14.7% |
| Other Punctuation | 3470 | 2.2% |
| Uppercase Letter | 3206 | 2.0% |
| Decimal Number | 1094 | 0.7% |
| Dash Punctuation | 534 | 0.3% |
| Close Punctuation | 300 | 0.2% |
| Open Punctuation | 300 | 0.2% |
| Control | 153 | 0.1% |
| Math Symbol | 68 | < 0.1% |
| Other values (3) | 19 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 14460 | |
| a | 11244 | 9.0% |
| i | 11100 | 8.9% |
| t | 10073 | 8.1% |
| s | 9329 | 7.5% |
| o | 9195 | 7.4% |
| n | 8688 | 7.0% |
| r | 7554 | 6.1% |
| c | 6262 | 5.0% |
| l | 5430 | 4.4% |
| Other values (44) | 31459 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 320 | 10.0% |
| A | 316 | 9.9% |
| B | 290 | 9.0% |
| I | 279 | 8.7% |
| S | 255 | 8.0% |
| C | 214 | 6.7% |
| M | 169 | 5.3% |
| G | 152 | 4.7% |
| P | 139 | 4.3% |
| N | 138 | 4.3% |
| Other values (17) | 934 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 1656 | |
| , | 1535 | |
| " | 88 | 2.5% |
| ' | 53 | 1.5% |
| ; | 47 | 1.4% |
| : | 38 | 1.1% |
| % | 33 | 1.0% |
| / | 18 | 0.5% |
| ? | 1 | < 0.1% |
| § | 1 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 237 | |
| 0 | 236 | |
| 2 | 139 | |
| 9 | 89 | 8.1% |
| 8 | 75 | 6.9% |
| 5 | 73 | 6.7% |
| 4 | 68 | 6.2% |
| 3 | 67 | 6.1% |
| 7 | 62 | 5.7% |
| 6 | 48 | 4.4% |
Math Symbol
| Value | Count | Frequency (%) |
| = | 60 | |
| + | 2 | 2.9% |
| × | 2 | 2.9% |
| < | 1 | 1.5% |
| > | 1 | 1.5% |
| ~ | 1 | 1.5% |
| − | 1 | 1.5% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 23029 | |
| (non-ASCII space, not preserved) | 5 | < 0.1% |
| (non-ASCII space, not preserved) | 2 | < 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 475 | |
| – | 53 | 9.9% |
| — | 6 | 1.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 298 | |
| ] | 2 | 0.7% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 298 | |
| [ | 2 | 0.7% |
Control
| Value | Count | Frequency (%) |
| (control character) | 153 | |
Other Symbol
| Value | Count | Frequency (%) |
| ° | 17 |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 1 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 127923 | |
| Common | 28980 | 18.5% |
| Greek | 71 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 14460 | |
| a | 11244 | 8.8% |
| i | 11100 | 8.7% |
| t | 10073 | 7.9% |
| s | 9329 | 7.3% |
| o | 9195 | 7.2% |
| n | 8688 | 6.8% |
| r | 7554 | 5.9% |
| c | 6262 | 4.9% |
| l | 5430 | 4.2% |
| Other values (51) | 34588 |
Common
| Value | Count | Frequency (%) |
| (space) | 23029 | |
| . | 1656 | 5.7% |
| , | 1535 | 5.3% |
| - | 475 | 1.6% |
| ) | 298 | 1.0% |
| ( | 298 | 1.0% |
| 1 | 237 | 0.8% |
| 0 | 236 | 0.8% |
| (control character) | 153 | 0.5% |
| 2 | 139 | 0.5% |
| Other values (32) | 924 | 3.2% |
Greek
| Value | Count | Frequency (%) |
| μ | 16 | |
| κ | 8 | |
| φ | 6 | 8.5% |
| β | 6 | 8.5% |
| τ | 4 | 5.6% |
| ς | 4 | 5.6% |
| ό | 4 | 5.6% |
| λ | 4 | 5.6% |
| ρ | 3 | 4.2% |
| σ | 3 | 4.2% |
| Other values (9) | 13 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 156791 | |
| None | 115 | 0.1% |
| Punctuation | 67 | < 0.1% |
| Math Operators | 1 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 23029 | |
| e | 14460 | 9.2% |
| a | 11244 | 7.2% |
| i | 11100 | 7.1% |
| t | 10073 | 6.4% |
| s | 9329 | 5.9% |
| o | 9195 | 5.9% |
| n | 8688 | 5.5% |
| r | 7554 | 4.8% |
| c | 6262 | 4.0% |
| Other values (74) | 45857 |
Punctuation
| Value | Count | Frequency (%) |
| – | 53 | |
| — | 6 | 9.0% |
| (non-ASCII space, not preserved) | 5 | 7.5% |
| (non-ASCII space, not preserved) | 2 | 3.0% |
| ’ | 1 | 1.5% |
None
| Value | Count | Frequency (%) |
| ° | 17 | |
| μ | 16 | 13.9% |
| κ | 8 | 7.0% |
| φ | 6 | 5.2% |
| µ | 6 | 5.2% |
| β | 6 | 5.2% |
| τ | 4 | 3.5% |
| ς | 4 | 3.5% |
| ό | 4 | 3.5% |
| λ | 4 | 3.5% |
| Other values (22) | 40 |
Math Operators
| Value | Count | Frequency (%) |
| − | 1 |
| Distinct | 2 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 19.1 KiB |
| Bacteria | |
|---|---|
| Bacteriophage |
Length
| Max length | 13 |
|---|---|
| Median length | 8 |
| Mean length | 9.480446927 |
| Min length | 8 |
Characters and Unicode
| Total characters | 1697 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Bacteria |
|---|---|
| 2nd row | Bacteria |
| 3rd row | Bacteria |
| 4th row | Bacteria |
| 5th row | Bacteria |
Common Values
| Value | Count | Frequency (%) |
| Bacteria | 126 | |
| Bacteriophage | 53 |
Length (histogram not preserved)
Category Frequency Plot (plot not preserved)
Words
| Value | Count | Frequency (%) |
| bacteria | 126 | |
| bacteriophage | 53 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 358 | |
| e | 232 | |
| B | 179 | |
| c | 179 | |
| t | 179 | |
| r | 179 | |
| i | 179 | |
| o | 53 | 3.1% |
| p | 53 | 3.1% |
| h | 53 | 3.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1518 | |
| Uppercase Letter | 179 | 10.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 358 | |
| e | 232 | |
| c | 179 | |
| t | 179 | |
| r | 179 | |
| i | 179 | |
| o | 53 | 3.5% |
| p | 53 | 3.5% |
| h | 53 | 3.5% |
| g | 53 | 3.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| B | 179 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1697 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 358 | |
| e | 232 | |
| B | 179 | |
| c | 179 | |
| t | 179 | |
| r | 179 | |
| i | 179 | |
| o | 53 | 3.1% |
| p | 53 | 3.1% |
| h | 53 | 3.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1697 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 358 | |
| e | 232 | |
| B | 179 | |
| c | 179 | |
| t | 179 | |
| r | 179 | |
| i | 179 | |
| o | 53 | 3.1% |
| p | 53 | 3.1% |
| h | 53 | 3.1% |
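
The length and character tables for this two-valued column can be reproduced from the value counts alone. The following is a minimal sketch, assuming only the counts in the Common Values table (126 × "Bacteria", 53 × "Bacteriophage"); the variable names are illustrative and this is not the profiler's own implementation.

```python
from collections import Counter
import unicodedata

# Value counts copied from the Common Values table above.
value_counts = {"Bacteria": 126, "Bacteriophage": 53}

n_rows = sum(value_counts.values())
total_chars = sum(len(v) * c for v, c in value_counts.items())
mean_length = total_chars / n_rows

# Count every character, weighted by how often its value occurs.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

# Group characters by Unicode general category
# ("Lu" = uppercase letter, "Ll" = lowercase letter).
category_counts = Counter()
for ch, count in char_counts.items():
    category_counts[unicodedata.category(ch)] += count

print(n_rows, total_chars, round(mean_length, 9))
print(len(char_counts), char_counts.most_common(2))
print(dict(category_counts))
```

This gives 1697 total characters over 179 rows, 11 distinct characters, and 1518 lowercase versus 179 uppercase letters, in line with the tables above.
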
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.4 KiB |
| en |
|---|
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 358 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | en |
|---|---|
| 2nd row | en |
| 3rd row | en |
| 4th row | en |
| 5th row | en |
Common Values
| Value | Count | Frequency (%) |
| en | 179 |
Length (histogram not preserved)
Category Frequency Plot (plot not preserved)
Words
| Value | Count | Frequency (%) |
| en | 179 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 179 | |
| n | 179 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 358 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 179 | |
| n | 179 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 358 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 179 | |
| n | 179 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 358 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 179 | |
| n | 179 |
| Distinct | 179 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 144.9 KiB |
| acinetobacter baumannii typically short almost round rodshaped coccobacillus gramnegative bacterium named bacteriologist paul baumann opportunistic pathogen human affecting people compromised immune system becoming increasingly important hospitalderived nosocomial infection specie genus acinetobacter often found soil sample leading common misconception baumannii soil organism almost exclusively isolated hospital environment although occasionally found environmental soil water sample natural habitat still known bacteria genus lack flagellum whiplike structure many bacteria use locomotion exhibit twitching swarming motility may due activity type iv pili polelike structure extended retracted motility baumannii may also due excretion exopolysaccharide creating film highmolecularweight sugar chain behind bacterium move forward clinical microbiologist typically differentiate member genus acinetobacter moraxellaceae performing oxidase test acinetobacter spp member moraxellaceae lack cytochrome c oxidasesa baumannii part acb complex baumannii calcoaceticus acinetobacter genomic specie 13tu difficult determine specific specie member acb complex comprise clinically relevant member genus baumannii also identified eskape pathogen enterococcus faecium staphylococcus aureus klebsiella pneumoniae acinetobacter baumannii pseudomonas aeruginosa enterobacter specie group pathogen high rate antibiotic resistance responsible majority nosocomial infectionscolloquially baumannii referred iraqibacter due seemingly sudden emergence military treatment facility iraq war continued issue veteran soldier served iraq afghanistan multidrugresistant baumannii spread civilian hospital part due transport infected soldier multiple medical facility covid19 pandemic coinfection baumannii secondary sarscov2 infection reported multiple time literature | 1 |
|---|---|
| bacteriophage φcb5 bacteriophage infects caulobacter bacteria caulobacteria bacteriophage discovered 1970 belongs genus cebevirus steitzviridae family type specie family bacteriophage widely distributed soil freshwater lake stream seawater place caulobacteria inhabit sensitive salinity | 1 |
| streptococcus sobrinus grampositive catalasenegative nonmotile anaerobic member genus streptococcus | 1 |
| treponema genus spiralshaped bacteria major treponeme specie human pathogen treponema pallidum whose subspecies responsible disease syphilis bejel yaw treponema carateum cause pinta treponema paraluiscuniculi associated syphilis rabbit treponema succinifaciens found gut microbiome traditional rural human population | 1 |
| ureaplasma urealyticum bacterium belonging genus ureaplasma family mycoplasmataceae order mycoplasmatales family consists genus mycoplasma ureaplasma type strain t960 two known biovars specie t960 27 strain bacterium commonly found urogenital tract human being overgrowth lead infection cause patient discomfort unlike bacteria ureaplasma urealyticum lack cell wall making unique physiology medical treatment | 1 |
| Other values (174) |
Length
| Max length | 2645 |
|---|---|
| Median length | 575 |
| Mean length | 652.3743017 |
| Min length | 53 |
Characters and Unicode
| Total characters | 116775 |
|---|---|
| Distinct characters | 66 |
| Distinct categories | 4 |
| Distinct scripts | 3 |
| Distinct blocks | 2 |
Unique
| Unique | 179 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | acinetobacter baumannii typically short almost round rodshaped coccobacillus gramnegative bacterium named bacteriologist paul baumann opportunistic pathogen human affecting people compromised immune system becoming increasingly important hospitalderived nosocomial infection specie genus acinetobacter often found soil sample leading common misconception baumannii soil organism almost exclusively isolated hospital environment although occasionally found environmental soil water sample natural habitat still known bacteria genus lack flagellum whiplike structure many bacteria use locomotion exhibit twitching swarming motility may due activity type iv pili polelike structure extended retracted motility baumannii may also due excretion exopolysaccharide creating film highmolecularweight sugar chain behind bacterium move forward clinical microbiologist typically differentiate member genus acinetobacter moraxellaceae performing oxidase test acinetobacter spp member moraxellaceae lack cytochrome c oxidasesa baumannii part acb complex baumannii calcoaceticus acinetobacter genomic specie 13tu difficult determine specific specie member acb complex comprise clinically relevant member genus baumannii also identified eskape pathogen enterococcus faecium staphylococcus aureus klebsiella pneumoniae acinetobacter baumannii pseudomonas aeruginosa enterobacter specie group pathogen high rate antibiotic resistance responsible majority nosocomial infectionscolloquially baumannii referred iraqibacter due seemingly sudden emergence military treatment facility iraq war continued issue veteran soldier served iraq afghanistan multidrugresistant baumannii spread civilian hospital part due transport infected soldier multiple medical facility covid19 pandemic coinfection baumannii secondary sarscov2 infection reported multiple time literature |
|---|---|
| 2nd row | actinomyces israelii specie grampositive rodshaped bacteria within genus actinomyces known live commensally within human israelii opportunistic pathogen cause actinomycosis many physiologically diverse strain specie known exist though strict anaerobe named german surgeon james adolf israel 18481926 studied organism first time 1878 |
| 3rd row | agrobacterium radiobacter commonly known agrobacterium tumefaciens causal agent crown gall disease formation tumour 140 specie eudicots rodshaped gramnegative soil bacterium symptom caused insertion small segment dna known tdna transfer dna confused trna transfer amino acid protein synthesis plasmid plant cell incorporated semirandom location plant genome plant genome engineered use agrobacterium delivery sequence hosted tdna binary vector agrobacterium tumefaciens alphaproteobacterium family rhizobiaceae includes nitrogenfixing legume symbionts unlike nitrogenfixing symbionts tumorproducing agrobacterium specie pathogenic benefit plant wide variety plant affected agrobacterium make great concern agriculture industryeconomically tumefaciens serious pathogen walnut grape vine stone fruit nut tree sugar beet horse radish rhubarb persistent nature tumor gall caused disease make particularly harmful perennial cropsagrobacterium tumefaciens grows optimally 28 c doubling time range 254h depending medium culture format level aeration temperature 30 c tumefaciens begin experience heat shock likely result error cell division |
| 4th row | anaplasma genus bacteria alphaproteobacterial order rickettsiales family anaplasmataceae anaplasma specie reside host blood cell lead disease anaplasmosis disease commonly occurs area competent tick vector indigenous including tropical semitropical area world intraerythrocytic anaplasma sppanaplasma specie biologically transmitted ixodes deertick vector prototypical specie marginale mechanically transmitted biting fly iatrogenically bloodcontaminated instrument one major consequence infection bovine red blood cell marginale development nonhaemolytic anaemia thus absence hemoglobinuria allows clinical differentiation another major tickborne disease bovine babesiosis caused babesia bigeminaspecies veterinary interest include anaplasma marginale anaplasma centrale cattle anaplasma ovis anaplasma mesaeterum sheep goat anaplasma phagocytophilum dog cat horse see human granulocytic anaplasmosis anaplasma platy dog |
| 5th row | anaplasma phagocytophilum formerly ehrlichia phagocytophilum gramnegative bacterium unusual tropism neutrophil cause anaplasmosis sheep cattle also known tickborne fever pasture fever also cause zoonotic disease human granulocytic anaplasmosisa phagocytophilum gramnegative obligate bacterium neutrophil cause human granulocytic anaplasmosis tickborne rickettsial disease bacterium invades neutrophil unique adaptation pathogenetic mechanism |
Common Values
| Value | Count | Frequency (%) |
| acinetobacter baumannii typically short almost round rodshaped coccobacillus gramnegative bacterium named bacteriologist paul baumann opportunistic pathogen human affecting people compromised immune system becoming increasingly important hospitalderived nosocomial infection specie genus acinetobacter often found soil sample leading common misconception baumannii soil organism almost exclusively isolated hospital environment although occasionally found environmental soil water sample natural habitat still known bacteria genus lack flagellum whiplike structure many bacteria use locomotion exhibit twitching swarming motility may due activity type iv pili polelike structure extended retracted motility baumannii may also due excretion exopolysaccharide creating film highmolecularweight sugar chain behind bacterium move forward clinical microbiologist typically differentiate member genus acinetobacter moraxellaceae performing oxidase test acinetobacter spp member moraxellaceae lack cytochrome c oxidasesa baumannii part acb complex baumannii calcoaceticus acinetobacter genomic specie 13tu difficult determine specific specie member acb complex comprise clinically relevant member genus baumannii also identified eskape pathogen enterococcus faecium staphylococcus aureus klebsiella pneumoniae acinetobacter baumannii pseudomonas aeruginosa enterobacter specie group pathogen high rate antibiotic resistance responsible majority nosocomial infectionscolloquially baumannii referred iraqibacter due seemingly sudden emergence military treatment facility iraq war continued issue veteran soldier served iraq afghanistan multidrugresistant baumannii spread civilian hospital part due transport infected soldier multiple medical facility covid19 pandemic coinfection baumannii secondary sarscov2 infection reported multiple time literature | 1 | 0.6% |
| bacteriophage φcb5 bacteriophage infects caulobacter bacteria caulobacteria bacteriophage discovered 1970 belongs genus cebevirus steitzviridae family type specie family bacteriophage widely distributed soil freshwater lake stream seawater place caulobacteria inhabit sensitive salinity | 1 | 0.6% |
| streptococcus sobrinus grampositive catalasenegative nonmotile anaerobic member genus streptococcus | 1 | 0.6% |
| treponema genus spiralshaped bacteria major treponeme specie human pathogen treponema pallidum whose subspecies responsible disease syphilis bejel yaw treponema carateum cause pinta treponema paraluiscuniculi associated syphilis rabbit treponema succinifaciens found gut microbiome traditional rural human population | 1 | 0.6% |
| ureaplasma urealyticum bacterium belonging genus ureaplasma family mycoplasmataceae order mycoplasmatales family consists genus mycoplasma ureaplasma type strain t960 two known biovars specie t960 27 strain bacterium commonly found urogenital tract human being overgrowth lead infection cause patient discomfort unlike bacteria ureaplasma urealyticum lack cell wall making unique physiology medical treatment | 1 | 0.6% |
| vibrio genus gramnegative bacteria possessing curvedrod comma shape several specie cause foodborne infection usually associated eating undercooked seafood typically found salt water vibrio specie facultative anaerobe test positive oxidase form spore member genus motile able polar lateral flagellum without sheath vibrio specie typically posse two chromosome unusual bacteria chromosome distinct independent origin replication conserved together time genus recent phylogeny constructed based suite gene multilocus sequence analysiso f müller 1773 1786 described eight specie genus vibrio included infusoria three spirilliforms specie today assigned eukaryote taxon eg euglenoid peranema diatom bacillaria however vibrio müller 1773 became regarded name zoological genus name bacterial genus became vibrio pacini 1854 filippo pacini isolated microorganism called vibrion cholera patient 1854 motility latin vibrio mean quivervibrio spp commonly found marine environment marine vibrio specie highly salt tolerant grow wide range salinity si paul et al 2021 isolated characterized identified multiple strain vibrio specie vibrio alginolyticus vibrio natriegens vibrio pelagius vibrio azureus marine sponge saint martin island area bay bengal bangladesh vibrio specie found dominant bacteria marine environment | 1 | 0.6% |
| vibrio cholerae specie gramnegative facultative anaerobe commashaped bacteria bacteria naturally live brackish saltwater attach easily chitincontaining shell crab shrimp shellfish strain v cholerae pathogenic human cause deadly disease cholera derived consumption undercooked raw marine life speciesv cholerae first described félixarchimède pouchet 1849 kind protozoa filippo pacini correctly identified bacterium scientific name adopted bacterium cause cholera discovered robert koch 1884 sambhu nath de isolated cholera toxin demonstrated toxin cause cholera 1959 bacterium flagellum one pole several pili throughout cell surface undergoes respiratory fermentative metabolism two serogroups called o1 o139 responsible cholera outbreak infection mainly drinking contaminated water therefore linked sanitation hygiene ingested invades intestinal mucosa cause diarrhea vomiting host within several hour 23 day ingestion oral rehydration solution antibiotic fluoroquinolones tetracycline common treatment method v cholerae two circular dna one dna produce cholera toxin ct protein cause profuse watery diarrhea known ricewater stool dna directly code toxin gene cholera toxin carried ctxphi ctxφ temperate bacteriophage virus virus inserted bacterial dna produce toxin | 1 | 0.6% |
| vibrio parahaemolyticus curved rodshaped gramnegative bacterium found sea estuary ingested cause gastrointestinal illness human v parahaemolyticus oxidase positive facultatively aerobic form spore like member genus vibrio specie motile single polar flagellum | 1 | 0.6% |
| vibrio vulnificus specie gramnegative motile curved rodshaped bacillus pathogenic bacteria genus vibrio present marine environment estuary brackish pond coastal area v vulnificus related v cholerae causative agent cholera least one strain v vulnificus bioluminescentinfection v vulnificus lead rapidly expanding cellulitis sepsis 279 first isolated source disease 1976 | 1 | 0.6% |
| wolbachia genus intracellular bacteria infects mainly arthropod specie including high proportion insect also nematode one common parasitic microbe possibly common reproductive parasite biosphere interaction host often complex case evolved mutualistic rather parasitic host specie cannot reproduce even survive without wolbachia colonisation one study concluded 16 neotropical insect specie carry bacteria genus many 25 70 insect specie estimated potential host | 1 | 0.6% |
| Other values (169) | 169 |
Length (histogram not preserved)
Words
| Value | Count | Frequency (%) |
| specie | 195 | 1.4% |
| bacteria | 147 | 1.0% |
| genus | 135 | 1.0% |
| human | 134 | 0.9% |
| cause | 125 | 0.9% |
| bacterium | 118 | 0.8% |
| infection | 117 | 0.8% |
| cell | 110 | 0.8% |
| phage | 100 | 0.7% |
| also | 96 | 0.7% |
| Other values (4023) | 12870 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 13968 | |
| e | 11940 | 10.2% |
| i | 9184 | 7.9% |
| a | 8898 | 7.6% |
| t | 7136 | 6.1% |
| o | 6932 | 5.9% |
| n | 6787 | 5.8% |
| r | 6676 | 5.7% |
| s | 6243 | 5.3% |
| c | 6129 | 5.2% |
| Other values (56) | 32882 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 101712 | |
| Space Separator | 13968 | 12.0% |
| Decimal Number | 1094 | 0.9% |
| Connector Punctuation | 1 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 11940 | |
| i | 9184 | 9.0% |
| a | 8898 | 8.7% |
| t | 7136 | 7.0% |
| o | 6932 | 6.8% |
| n | 6787 | 6.7% |
| r | 6676 | 6.6% |
| s | 6243 | 6.1% |
| c | 6129 | 6.0% |
| l | 5422 | 5.3% |
| Other values (44) | 26365 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 237 | |
| 0 | 236 | |
| 2 | 139 | |
| 9 | 89 | 8.1% |
| 8 | 75 | 6.9% |
| 5 | 73 | 6.7% |
| 4 | 68 | 6.2% |
| 3 | 67 | 6.1% |
| 7 | 62 | 5.7% |
| 6 | 48 | 4.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 13968 | |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 101635 | |
| Common | 15069 | 12.9% |
| Greek | 71 | 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 11940 | |
| i | 9184 | 9.0% |
| a | 8898 | 8.8% |
| t | 7136 | 7.0% |
| o | 6932 | 6.8% |
| n | 6787 | 6.7% |
| r | 6676 | 6.6% |
| s | 6243 | 6.1% |
| c | 6129 | 6.0% |
| l | 5422 | 5.3% |
| Other values (25) | 26288 |
Greek
| Value | Count | Frequency (%) |
| μ | 16 | |
| κ | 8 | |
| φ | 7 | |
| β | 6 | 8.5% |
| τ | 4 | 5.6% |
| λ | 4 | 5.6% |
| ό | 4 | 5.6% |
| ς | 4 | 5.6% |
| σ | 3 | 4.2% |
| ρ | 3 | 4.2% |
| Other values (8) | 12 |
Common
| Value | Count | Frequency (%) |
| (space) | 13968 | |
| 1 | 237 | 1.6% |
| 0 | 236 | 1.6% |
| 2 | 139 | 0.9% |
| 9 | 89 | 0.6% |
| 8 | 75 | 0.5% |
| 5 | 73 | 0.5% |
| 4 | 68 | 0.5% |
| 3 | 67 | 0.4% |
| 7 | 62 | 0.4% |
| Other values (3) | 55 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 116680 | |
| None | 95 | 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 13968 | |
| e | 11940 | 10.2% |
| i | 9184 | 7.9% |
| a | 8898 | 7.6% |
| t | 7136 | 6.1% |
| o | 6932 | 5.9% |
| n | 6787 | 5.8% |
| r | 6676 | 5.7% |
| s | 6243 | 5.4% |
| c | 6129 | 5.3% |
| Other values (28) | 32787 |
None
| Value | Count | Frequency (%) |
| μ | 16 | |
| κ | 8 | 8.4% |
| φ | 7 | 7.4% |
| µ | 6 | 6.3% |
| β | 6 | 6.3% |
| τ | 4 | 4.2% |
| ó | 4 | 4.2% |
| λ | 4 | 4.2% |
| ό | 4 | 4.2% |
| ς | 4 | 4.2% |
| Other values (18) | 32 |
| Distinct | 133 |
|---|---|
| Distinct (%) | 74.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 129.6536313 |

| Minimum | 9 |
|---|---|
| Maximum | 530 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 9 |
|---|---|
| 5-th percentile | 29.9 |
| Q1 | 55 |
| median | 102 |
| Q3 | 177 |
| 95-th percentile | 321.6 |
| Maximum | 530 |
| Range | 521 |
| Interquartile range (IQR) | 122 |
Descriptive statistics
| Standard deviation | 98.10160192 |
|---|---|
| Coefficient of variation (CV) | 0.7566436894 |
| Kurtosis | 1.615886312 |
| Mean | 129.6536313 |
| Median Absolute Deviation (MAD) | 54 |
| Skewness | 1.299885631 |
| Sum | 23208 |
| Variance | 9623.924299 |
| Monotonicity | Not monotonic |
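
Several figures in the tables above are derived from others: Range is Maximum minus Minimum, the interquartile range is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below rechecks those relationships using the reported numbers; the observation count of 179 comes from the dataset overview, and the variable names are illustrative.

```python
# Figures copied from the summary tables above.
n = 179             # number of observations (dataset overview; no missing values)
total = 23208       # Sum
minimum, maximum = 9, 530
q1, q3 = 55, 177
std = 98.10160192   # Standard deviation

mean = total / n              # Mean = Sum / number of observations
value_range = maximum - minimum
iqr = q3 - q1                 # Interquartile range
cv = std / mean               # Coefficient of variation
variance = std ** 2

print(f"mean={mean:.7f} range={value_range} iqr={iqr}")
print(f"cv={cv:.10f} variance={variance:.6f}")
```

Each recomputed value agrees with the corresponding table entry to the precision shown in the report.
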
| Value | Count | Frequency (%) |
| 55 | 4 | 2.2% |
| 72 | 3 | 1.7% |
| 47 | 3 | 1.7% |
| 78 | 3 | 1.7% |
| 41 | 3 | 1.7% |
| 170 | 3 | 1.7% |
| 106 | 3 | 1.7% |
| 59 | 3 | 1.7% |
| 46 | 3 | 1.7% |
| 86 | 3 | 1.7% |
| Other values (123) | 148 |
| Value | Count | Frequency (%) |
| 9 | 1 | |
| 10 | 1 | |
| 11 | 1 | |
| 15 | 2 | |
| 19 | 1 | |
| 20 | 1 | |
| 29 | 2 | |
| 30 | 1 | |
| 31 | 2 | |
| 32 | 1 |
| Value | Count | Frequency (%) |
| 530 | 1 | |
| 450 | 1 | |
| 425 | 1 | |
| 400 | 1 | |
| 358 | 1 | |
| 354 | 1 | |
| 351 | 1 | |
| 336 | 1 | |
| 327 | 1 | |
| 321 | 1 |
| Distinct | 168 |
|---|---|
| Distinct (%) | 93.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 748.2960894 |

| Minimum | 64 |
|---|---|
| Maximum | 3079 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 64 |
|---|---|
| 5-th percentile | 174.7 |
| Q1 | 318 |
| median | 575 |
| Q3 | 1042 |
| 95-th percentile | 1861.5 |
| Maximum | 3079 |
| Range | 3015 |
| Interquartile range (IQR) | 724 |
Descriptive statistics
| Standard deviation | 561.5916251 |
|---|---|
| Coefficient of variation (CV) | 0.7504938661 |
| Kurtosis | 1.577051016 |
| Mean | 748.2960894 |
| Median Absolute Deviation (MAD) | 295 |
| Skewness | 1.294963601 |
| Sum | 133945 |
| Variance | 315385.1534 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 373 | 2 | 1.1% |
| 298 | 2 | 1.1% |
| 1316 | 2 | 1.1% |
| 214 | 2 | 1.1% |
| 229 | 2 | 1.1% |
| 331 | 2 | 1.1% |
| 411 | 2 | 1.1% |
| 310 | 2 | 1.1% |
| 318 | 2 | 1.1% |
| 1046 | 2 | 1.1% |
| Other values (158) | 159 |
| Value | Count | Frequency (%) |
| 64 | 1 | |
| 77 | 1 | |
| 100 | 1 | |
| 101 | 1 | |
| 109 | 1 | |
| 125 | 1 | |
| 131 | 1 | |
| 171 | 1 | |
| 172 | 1 | |
| 175 | 1 |
| Value | Count | Frequency (%) |
| 3079 | 1 | |
| 2552 | 1 | |
| 2350 | 1 | |
| 2281 | 1 | |
| 2118 | 1 | |
| 1975 | 1 | |
| 1906 | 1 | |
| 1876 | 1 | |
| 1875 | 1 | |
| 1860 | 1 |
sentence_count
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 26 |
|---|---|
| Distinct (%) | 14.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.25139665 |

| Minimum | 2 |
|---|---|
| Maximum | 46 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 5 |
| median | 8 |
| Q3 | 14 |
| 95-th percentile | 22.2 |
| Maximum | 46 |
| Range | 44 |
| Interquartile range (IQR) | 9 |
Descriptive statistics
| Standard deviation | 7.371249628 |
|---|---|
| Coefficient of variation (CV) | 0.7190483288 |
| Kurtosis | 3.578145919 |
| Mean | 10.25139665 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 1.627951936 |
| Sum | 1835 |
| Variance | 54.33532107 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4 | 19 | 10.6% |
| 6 | 18 | 10.1% |
| 7 | 16 | 8.9% |
| 5 | 14 | 7.8% |
| 3 | 14 | 7.8% |
| 9 | 12 | 6.7% |
| 10 | 11 | 6.1% |
| 12 | 8 | 4.5% |
| 8 | 7 | 3.9% |
| 18 | 6 | 3.4% |
| Other values (16) | 54 |
| Value | Count | Frequency (%) |
| 2 | 6 | 3.4% |
| 3 | 14 | |
| 4 | 19 | |
| 5 | 14 | |
| 6 | 18 | |
| 7 | 16 | |
| 8 | 7 | 3.9% |
| 9 | 12 | |
| 10 | 11 | |
| 11 | 5 | 2.8% |
| Value | Count | Frequency (%) |
| 46 | 1 | 0.6% |
| 35 | 3 | |
| 27 | 1 | 0.6% |
| 26 | 1 | 0.6% |
| 25 | 2 | 1.1% |
| 24 | 1 | 0.6% |
| 22 | 3 | |
| 21 | 4 | |
| 20 | 6 | |
| 19 | 5 |
avg_word_length
Real number (ℝ≥0)
| Distinct | 175 |
|---|---|
| Distinct (%) | 97.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.867284924 |

| Minimum | 4.727272727 |
|---|---|
| Maximum | 10.1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 4.727272727 |
|---|---|
| 5-th percentile | 5.109244662 |
| Q1 | 5.499084249 |
| median | 5.813559322 |
| Q3 | 6.122863248 |
| 95-th percentile | 6.707590569 |
| Maximum | 10.1 |
| Range | 5.372727273 |
| Interquartile range (IQR) | 0.6237789988 |
Descriptive statistics
| Standard deviation | 0.5755453755 |
|---|---|
| Coefficient of variation (CV) | 0.09809398775 |
| Kurtosis | 15.52984099 |
| Mean | 5.867284924 |
| Median Absolute Deviation (MAD) | 0.3153908239 |
| Skewness | 2.439832391 |
| Sum | 1050.244001 |
| Variance | 0.3312524792 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 7 | 3 | 1.7% |
| 5.6 | 2 | 1.1% |
| 6.326530612 | 2 | 1.1% |
| 5.983050847 | 1 | 0.6% |
| 5.875 | 1 | 0.6% |
| 5.966101695 | 1 | 0.6% |
| 5.488372093 | 1 | 0.6% |
| 5.498168498 | 1 | 0.6% |
| 5.486988848 | 1 | 0.6% |
| 6.019607843 | 1 | 0.6% |
| Other values (165) | 165 |
| Value | Count | Frequency (%) |
| 4.727272727 | 1 | |
| 4.807692308 | 1 | |
| 4.872340426 | 1 | |
| 4.934782609 | 1 | |
| 4.9375 | 1 | |
| 5.048611111 | 1 | |
| 5.05 | 1 | |
| 5.103092784 | 1 | |
| 5.106145251 | 1 | |
| 5.109589041 | 1 |
| Value | Count | Frequency (%) |
| 10.1 | 1 | 0.6% |
| 7.266666667 | 1 | 0.6% |
| 7.111111111 | 1 | 0.6% |
| 7.090909091 | 1 | 0.6% |
| 7.035460993 | 1 | 0.6% |
| 7 | 3 | 1.7% |
| 6.756756757 | 1 | 0.6% |
| 6.70212766 | 1 | 0.6% |
| 6.684931507 | 1 | 0.6% |
| 6.666666667 | 1 | 0.6% |
| Distinct | 152 |
|---|---|
| Distinct (%) | 84.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.64943597 |
| Minimum | 4.5 |
| Maximum | 26.6 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 4.5 |
|---|---|
| 5-th percentile | 6.666666667 |
| Q1 | 9.527777778 |
| median | 11.875 |
| Q3 | 15.66666667 |
| 95-th percentile | 19.92 |
| Maximum | 26.6 |
| Range | 22.1 |
| Interquartile range (IQR) | 6.138888889 |
Descriptive statistics
| Standard deviation | 4.391931159 |
|---|---|
| Coefficient of variation (CV) | 0.3472037147 |
| Kurtosis | 0.205744776 |
| Mean | 12.64943597 |
| Median Absolute Deviation (MAD) | 3.041666667 |
| Skewness | 0.6626630912 |
| Sum | 2264.249039 |
| Variance | 19.2890593 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 17 | 4 | 2.2% |
| 7.5 | 3 | 1.7% |
| 13.66666667 | 3 | 1.7% |
| 11 | 3 | 1.7% |
| 12.66666667 | 2 | 1.1% |
| 8.166666667 | 2 | 1.1% |
| 9.833333333 | 2 | 1.1% |
| 18 | 2 | 1.1% |
| 8 | 2 | 1.1% |
| 7.4 | 2 | 1.1% |
| Other values (142) | 154 | 86.0% |
| Value | Count | Frequency (%) |
| 4.5 | 1 | 0.6% |
| 5 | 1 | 0.6% |
| 5.5 | 1 | 0.6% |
| 5.8 | 1 | 0.6% |
| 5.846153846 | 1 | 0.6% |
| 6.125 | 1 | 0.6% |
| 6.333333333 | 1 | 0.6% |
| 6.6 | 1 | 0.6% |
| 6.666666667 | 2 | 1.1% |
| 6.875 | 1 | 0.6% |
| Value | Count | Frequency (%) |
| 26.6 | 1 | 0.6% |
| 25.5 | 1 | 0.6% |
| 24.5 | 1 | 0.6% |
| 23.86666667 | 1 | 0.6% |
| 23.4 | 1 | 0.6% |
| 22.28571429 | 1 | 0.6% |
| 22.125 | 1 | 0.6% |
| 22.08333333 | 1 | 0.6% |
| 20.4 | 1 | 0.6% |
| 19.86666667 | 1 | 0.6% |
name_word_count
Categorical
HIGH CORRELATION
| Distinct | 3 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| Value | Count |
| 2 | 123 |
| 1 | 43 |
| 3 | 13 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 179 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2 |
|---|---|
| 2nd row | 2 |
| 3rd row | 2 |
| 4th row | 1 |
| 5th row | 2 |
Common Values
| Value | Count | Frequency (%) |
| 2 | 123 | 68.7% |
| 1 | 43 | 24.0% |
| 3 | 13 | 7.3% |
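The Frequency (%) column is each count divided by the number of observations (179, with no missing values). A small sketch of that arithmetic follows, assuming one-decimal rounding as in the rows that show a percentage. `frequency_percent` is a hypothetical helper name, not something the report defines.

```python
def frequency_percent(counts, total=None):
    """Return each count as a percentage string of the total, one decimal place."""
    if total is None:
        total = sum(counts.values())
    return {value: f"{count / total:.1%}" for value, count in counts.items()}


# Counts from the name_word_count "Common Values" table above.
name_word_count = {"2": 123, "1": 43, "3": 13}
percentages = frequency_percent(name_word_count)
print(percentages)
```

For these counts the helper gives 68.7%, 24.0% and 7.3%, which agrees with the percentages printed for values 1 and 3.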
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 123 | |
| 1 | 43 | 24.0% |
| 3 | 13 | 7.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 179 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 123 | |
| 1 | 43 | 24.0% |
| 3 | 13 | 7.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 179 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 123 | |
| 1 | 43 | 24.0% |
| 3 | 13 | 7.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 179 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 123 | |
| 1 | 43 | 24.0% |
| 3 | 13 | 7.3% |
name_char_count
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 24 |
|---|---|
| Distinct (%) | 13.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17.07821229 |
| Minimum | 6 |
| Maximum | 30 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 6 |
|---|---|
| 5-th percentile | 8 |
| Q1 | 13 |
| median | 18 |
| Q3 | 21 |
| 95-th percentile | 24.1 |
| Maximum | 30 |
| Range | 24 |
| Interquartile range (IQR) | 8 |
Descriptive statistics
| Standard deviation | 5.108323802 |
|---|---|
| Coefficient of variation (CV) | 0.2991134971 |
| Kurtosis | -0.6184960472 |
| Mean | 17.07821229 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.2402103769 |
| Sum | 3057 |
| Variance | 26.09497207 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 18 | 22 | 12.3% |
| 20 | 18 | 10.1% |
| 21 | 15 | 8.4% |
| 19 | 15 | 8.4% |
| 17 | 12 | 6.7% |
| 10 | 10 | 5.6% |
| 12 | 9 | 5.0% |
| 23 | 9 | 5.0% |
| 22 | 7 | 3.9% |
| 8 | 7 | 3.9% |
| Other values (14) | 55 | 30.7% |
| Value | Count | Frequency (%) |
| 6 | 1 | 0.6% |
| 7 | 3 | 1.7% |
| 8 | 7 | 3.9% |
| 9 | 7 | 3.9% |
| 10 | 10 | 5.6% |
| 11 | 6 | 3.4% |
| 12 | 9 | 5.0% |
| 13 | 6 | 3.4% |
| 14 | 5 | 2.8% |
| 15 | 6 | 3.4% |
| Value | Count | Frequency (%) |
| 30 | 1 | 0.6% |
| 29 | 1 | 0.6% |
| 27 | 1 | 0.6% |
| 26 | 2 | 1.1% |
| 25 | 4 | 2.2% |
| 24 | 6 | 3.4% |
| 23 | 9 | 5.0% |
| 22 | 7 | 3.9% |
| 21 | 15 | 8.4% |
| 20 | 18 | 10.1% |
| Distinct | 29 |
|---|---|
| Distinct (%) | 16.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.568901304 |
| Minimum | 3.5 |
| Maximum | 19 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 3.5 |
|---|---|
| 5-th percentile | 6 |
| Q1 | 8.5 |
| median | 9.5 |
| Q3 | 11 |
| 95-th percentile | 13 |
| Maximum | 19 |
| Range | 15.5 |
| Interquartile range (IQR) | 2.5 |
Descriptive statistics
| Standard deviation | 2.255434823 |
|---|---|
| Coefficient of variation (CV) | 0.2357046803 |
| Kurtosis | 2.158351327 |
| Mean | 9.568901304 |
| Median Absolute Deviation (MAD) | 1.5 |
| Skewness | 0.423882704 |
| Sum | 1712.833333 |
| Variance | 5.086986239 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 9 | 27 | 15.1% |
| 10 | 25 | 14.0% |
| 10.5 | 13 | 7.3% |
| 9.5 | 12 | 6.7% |
| 8 | 12 | 6.7% |
| 11 | 11 | 6.1% |
| 12 | 11 | 6.1% |
| 8.5 | 11 | 6.1% |
| 7 | 8 | 4.5% |
| 13 | 8 | 4.5% |
| Other values (19) | 41 | 22.9% |
| Value | Count | Frequency (%) |
| 3.5 | 1 | 0.6% |
| 3.666666667 | 1 | 0.6% |
| 4 | 1 | 0.6% |
| 5 | 2 | 1.1% |
| 5.5 | 1 | 0.6% |
| 5.666666667 | 1 | 0.6% |
| 6 | 6 | 3.4% |
| 6.333333333 | 2 | 1.1% |
| 6.666666667 | 2 | 1.1% |
| 7 | 8 | 4.5% |
| Value | Count | Frequency (%) |
| 19 | 1 | 0.6% |
| 18 | 1 | 0.6% |
| 15 | 1 | 0.6% |
| 14 | 1 | 0.6% |
| 13.5 | 1 | 0.6% |
| 13 | 8 | 4.5% |
| 12.5 | 4 | 2.2% |
| 12 | 11 | 6.1% |
| 11.5 | 7 | 3.9% |
| 11 | 11 | 6.1% |
| Distinct | 156 |
|---|---|
| Distinct (%) | 87.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.06782696168 |
| Minimum | -0.2825 |
| Maximum | 0.55 |
| Zeros | 17 |
| Zeros (%) | 9.5% |
| Negative | 42 |
| Negative (%) | 23.5% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | -0.2825 |
|---|---|
| 5-th percentile | -0.1023212121 |
| Q1 | 0 |
| median | 0.05583333333 |
| Q3 | 0.115 |
| 95-th percentile | 0.3116666667 |
| Maximum | 0.55 |
| Range | 0.8325 |
| Interquartile range (IQR) | 0.115 |
Descriptive statistics
| Standard deviation | 0.1275124711 |
|---|---|
| Coefficient of variation (CV) | 1.879967316 |
| Kurtosis | 2.931885608 |
| Mean | 0.06782696168 |
| Median Absolute Deviation (MAD) | 0.05587454212 |
| Skewness | 1.008554241 |
| Sum | 12.14102614 |
| Variance | 0.01625943028 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 17 | 9.5% |
| 0.4 | 2 | 1.1% |
| 0.115 | 2 | 1.1% |
| -0.0625 | 2 | 1.1% |
| 0.05 | 2 | 1.1% |
| 0.06875 | 2 | 1.1% |
| -0.04166666667 | 2 | 1.1% |
| -0.1333333333 | 2 | 1.1% |
| -0.01956989247 | 1 | 0.6% |
| 0.05870629371 | 1 | 0.6% |
| Other values (146) | 146 | 81.6% |
| Value | Count | Frequency (%) |
| -0.2825 | 1 | 0.6% |
| -0.2776785714 | 1 | 0.6% |
| -0.2125 | 1 | 0.6% |
| -0.1625 | 1 | 0.6% |
| -0.1416666667 | 1 | 0.6% |
| -0.1375 | 1 | 0.6% |
| -0.1333333333 | 2 | 1.1% |
| -0.1166666667 | 1 | 0.6% |
| -0.1007272727 | 1 | 0.6% |
| -0.09166666667 | 1 | 0.6% |
| Value | Count | Frequency (%) |
| 0.55 | 1 | 0.6% |
| 0.525 | 1 | 0.6% |
| 0.5 | 1 | 0.6% |
| 0.4666666667 | 1 | 0.6% |
| 0.4 | 2 | 1.1% |
| 0.3875 | 1 | 0.6% |
| 0.375 | 1 | 0.6% |
| 0.3333333333 | 1 | 0.6% |
| 0.3092592593 | 1 | 0.6% |
| 0.2857142857 | 1 | 0.6% |
| Distinct | 11 |
|---|---|
| Distinct (%) | 6.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.273743017 |
| Minimum | 0 |
| Maximum | 14 |
| Zeros | 92 |
| Zeros (%) | 51.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 2 |
| 95-th percentile | 4.1 |
| Maximum | 14 |
| Range | 14 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 2.00222712 |
|---|---|
| Coefficient of variation (CV) | 1.571923923 |
| Kurtosis | 10.92274489 |
| Mean | 1.273743017 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.768483866 |
| Sum | 228 |
| Variance | 4.008913439 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 92 | 51.4% |
| 2 | 30 | 16.8% |
| 1 | 28 | 15.6% |
| 3 | 11 | 6.1% |
| 4 | 9 | 5.0% |
| 7 | 3 | 1.7% |
| 8 | 2 | 1.1% |
| 5 | 1 | 0.6% |
| 9 | 1 | 0.6% |
| 14 | 1 | 0.6% |
| Value | Count | Frequency (%) |
| 0 | 92 | 51.4% |
| 1 | 28 | 15.6% |
| 2 | 30 | 16.8% |
| 3 | 11 | 6.1% |
| 4 | 9 | 5.0% |
| 5 | 1 | 0.6% |
| 6 | 1 | 0.6% |
| 7 | 3 | 1.7% |
| 8 | 2 | 1.1% |
| 9 | 1 | 0.6% |
| Value | Count | Frequency (%) |
| 14 | 1 | 0.6% |
| 9 | 1 | 0.6% |
| 8 | 2 | 1.1% |
| 7 | 3 | 1.7% |
| 6 | 1 | 0.6% |
| 5 | 1 | 0.6% |
| 4 | 9 | 5.0% |
| 3 | 11 | 6.1% |
| 2 | 30 | 16.8% |
| 1 | 28 | 15.6% |
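Because this variable takes only 11 distinct integer values, its summary statistics can be rebuilt from the frequency tables alone. The sketch below assumes the reported variance is the sample variance (n − 1 denominator). The `counts` dictionary is transcribed from the tables above, and the variable names are illustrative.

```python
import statistics

# value -> count, combined from the frequency tables above (11 distinct values).
counts = {0: 92, 1: 28, 2: 30, 3: 11, 4: 9, 5: 1, 6: 1, 7: 3, 8: 2, 9: 1, 14: 1}

# Expand the table back into one observation per row.
observations = [value for value, count in counts.items() for _ in range(count)]

n = len(observations)                          # number of observations
total = sum(observations)                      # "Sum" row
mean = statistics.mean(observations)           # "Mean" row
variance = statistics.variance(observations)   # sample variance (n - 1)
zeros_pct = 100 * counts[0] / n                # "Zeros (%)" row

print(n, total, round(mean, 9), round(variance, 9), round(zeros_pct, 1))
```

This reproduces the reported Sum (228), Mean (1.273743017), Variance (4.008913439) and Zeros (51.4%), which supports the n − 1 assumption for the variance.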
| Distinct | 8 |
|---|---|
| Distinct (%) | 4.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.5865921788 |
| Minimum | 0 |
| Maximum | 8 |
| Zeros | 121 |
| Zeros (%) | 67.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 8 |
| Range | 8 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.164570185 |
|---|---|
| Coefficient of variation (CV) | 1.985314886 |
| Kurtosis | 12.50625325 |
| Mean | 0.5865921788 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.100941333 |
| Sum | 105 |
| Variance | 1.356223715 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 121 | 67.6% |
| 1 | 36 | 20.1% |
| 2 | 10 | 5.6% |
| 3 | 6 | 3.4% |
| 4 | 3 | 1.7% |
| 5 | 1 | 0.6% |
| 8 | 1 | 0.6% |
| 6 | 1 | 0.6% |
| Value | Count | Frequency (%) |
| 0 | 121 | 67.6% |
| 1 | 36 | 20.1% |
| 2 | 10 | 5.6% |
| 3 | 6 | 3.4% |
| 4 | 3 | 1.7% |
| 5 | 1 | 0.6% |
| 6 | 1 | 0.6% |
| 8 | 1 | 0.6% |
| Value | Count | Frequency (%) |
| 8 | 1 | 0.6% |
| 6 | 1 | 0.6% |
| 5 | 1 | 0.6% |
| 4 | 3 | 1.7% |
| 3 | 6 | 3.4% |
| 2 | 10 | 5.6% |
| 1 | 36 | 20.1% |
| 0 | 121 | 67.6% |
| Distinct | 2 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| Value | Count |
| 0 | 174 |
| 1 | 5 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 179 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 174 | 97.2% |
| 1 | 5 | 2.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 174 | |
| 1 | 5 | 2.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 179 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 174 | |
| 1 | 5 | 2.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 179 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 174 | |
| 1 | 5 | 2.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 179 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 174 | |
| 1 | 5 | 2.8% |
| Distinct | 2 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| Value | Count |
| 0 | 172 |
| 1 | 7 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 179 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 172 | 96.1% |
| 1 | 7 | 3.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 172 | |
| 1 | 7 | 3.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 179 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 172 | |
| 1 | 7 | 3.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 179 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 172 | |
| 1 | 7 | 3.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 179 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 172 | |
| 1 | 7 | 3.9% |
| Distinct | 6 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.4860335196 |
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 134 |
| Zeros (%) | 74.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0.5 |
| 95-th percentile | 3 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0.5 |
Descriptive statistics
| Standard deviation | 1.045899118 |
|---|---|
| Coefficient of variation (CV) | 2.151907381 |
| Kurtosis | 7.241646686 |
| Mean | 0.4860335196 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.644464033 |
| Sum | 87 |
| Variance | 1.093904965 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 134 | 74.9% |
| 1 | 23 | 12.8% |
| 2 | 11 | 6.1% |
| 3 | 6 | 3.4% |
| 5 | 4 | 2.2% |
| 4 | 1 | 0.6% |
| Value | Count | Frequency (%) |
| 0 | 134 | 74.9% |
| 1 | 23 | 12.8% |
| 2 | 11 | 6.1% |
| 3 | 6 | 3.4% |
| 4 | 1 | 0.6% |
| 5 | 4 | 2.2% |
| Value | Count | Frequency (%) |
| 5 | 4 | 2.2% |
| 4 | 1 | 0.6% |
| 3 | 6 | 3.4% |
| 2 | 11 | 6.1% |
| 1 | 23 | 12.8% |
| 0 | 134 | 74.9% |
| Distinct | 2 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| Value | Count |
| 0 | 178 |
| 1 | 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 179 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.6% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 178 | 99.4% |
| 1 | 1 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 178 | |
| 1 | 1 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 179 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 178 | |
| 1 | 1 | 0.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 179 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 178 | |
| 1 | 1 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 179 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 178 | |
| 1 | 1 | 0.6% |
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| Value | Count |
| 0 | 179 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 179 |
|---|---|
| Distinct characters | 1 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 179 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 179 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 179 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 179 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 179 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 179 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 179 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 179 |
| Distinct | 4 |
|---|---|
| Distinct (%) | 2.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| Value | Count |
| 0 | 155 |
| 1 | 17 |
| 2 | 5 |
| 3 | 2 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 179 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 155 | 86.6% |
| 1 | 17 | 9.5% |
| 2 | 5 | 2.8% |
| 3 | 2 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 155 | |
| 1 | 17 | 9.5% |
| 2 | 5 | 2.8% |
| 3 | 2 | 1.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 179 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 155 | |
| 1 | 17 | 9.5% |
| 2 | 5 | 2.8% |
| 3 | 2 | 1.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 179 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 155 | |
| 1 | 17 | 9.5% |
| 2 | 5 | 2.8% |
| 3 | 2 | 1.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 179 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 155 | |
| 1 | 17 | 9.5% |
| 2 | 5 | 2.8% |
| 3 | 2 | 1.1% |
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| Value | Count |
| 0 | 179 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 179 |
|---|---|
| Distinct characters | 1 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 179 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 179 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 179 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 179 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 179 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 179 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 179 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 179 |
| Distinct | 5 |
|---|---|
| Distinct (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| Value | Count |
| 0 | 147 |
| 1 | 23 |
| 2 | 4 |
| 4 | 3 |
| 3 | 2 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 179 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 147 | 82.1% |
| 1 | 23 | 12.8% |
| 2 | 4 | 2.2% |
| 4 | 3 | 1.7% |
| 3 | 2 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 147 | |
| 1 | 23 | 12.8% |
| 2 | 4 | 2.2% |
| 4 | 3 | 1.7% |
| 3 | 2 | 1.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 179 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 147 | |
| 1 | 23 | 12.8% |
| 2 | 4 | 2.2% |
| 4 | 3 | 1.7% |
| 3 | 2 | 1.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 179 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 147 | |
| 1 | 23 | 12.8% |
| 2 | 4 | 2.2% |
| 4 | 3 | 1.7% |
| 3 | 2 | 1.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 179 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 147 | |
| 1 | 23 | 12.8% |
| 2 | 4 | 2.2% |
| 4 | 3 | 1.7% |
| 3 | 2 | 1.1% |
| Distinct | 4 |
|---|---|
| Distinct (%) | 2.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| Value | Count |
| 0 | 149 |
| 1 | 21 |
| 2 | 7 |
| 3 | 2 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 179 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 149 | 83.2% |
| 1 | 21 | 11.7% |
| 2 | 7 | 3.9% |
| 3 | 2 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 149 | |
| 1 | 21 | 11.7% |
| 2 | 7 | 3.9% |
| 3 | 2 | 1.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 179 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 149 | |
| 1 | 21 | 11.7% |
| 2 | 7 | 3.9% |
| 3 | 2 | 1.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 179 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 149 | |
| 1 | 21 | 11.7% |
| 2 | 7 | 3.9% |
| 3 | 2 | 1.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 179 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 149 | |
| 1 | 21 | 11.7% |
| 2 | 7 | 3.9% |
| 3 | 2 | 1.1% |
| Distinct | 8 |
|---|---|
| Distinct (%) | 4.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.078212291 |
| Minimum | 0 |
| Maximum | 10 |
| Zeros | 88 |
| Zeros (%) | 49.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.508226557 |
|---|---|
| Coefficient of variation (CV) | 1.398821522 |
| Kurtosis | 6.842243999 |
| Mean | 1.078212291 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 2.121183256 |
| Sum | 193 |
| Variance | 2.274747348 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 88 | 49.2% |
| 1 | 42 | 23.5% |
| 2 | 23 | 12.8% |
| 3 | 11 | 6.1% |
| 4 | 9 | 5.0% |
| 5 | 4 | 2.2% |
| 6 | 1 | 0.6% |
| 10 | 1 | 0.6% |
| Value | Count | Frequency (%) |
| 0 | 88 | 49.2% |
| 1 | 42 | 23.5% |
| 2 | 23 | 12.8% |
| 3 | 11 | 6.1% |
| 4 | 9 | 5.0% |
| 5 | 4 | 2.2% |
| 6 | 1 | 0.6% |
| 10 | 1 | 0.6% |
| Value | Count | Frequency (%) |
| 10 | 1 | 0.6% |
| 6 | 1 | 0.6% |
| 5 | 4 | 2.2% |
| 4 | 9 | 5.0% |
| 3 | 11 | 6.1% |
| 2 | 23 | 12.8% |
| 1 | 42 | 23.5% |
| 0 | 88 | 49.2% |
| Distinct | 4 |
|---|---|
| Distinct (%) | 2.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| Value | Count |
| 0 | 157 |
| 1 | 13 |
| 2 | 8 |
| 3 | 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 179 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.6% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 157 | 87.7% |
| 1 | 13 | 7.3% |
| 2 | 8 | 4.5% |
| 3 | 1 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 157 | |
| 1 | 13 | 7.3% |
| 2 | 8 | 4.5% |
| 3 | 1 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 179 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 157 | |
| 1 | 13 | 7.3% |
| 2 | 8 | 4.5% |
| 3 | 1 | 0.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 179 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 157 | |
| 1 | 13 | 7.3% |
| 2 | 8 | 4.5% |
| 3 | 1 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 179 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 157 | |
| 1 | 13 | 7.3% |
| 2 | 8 | 4.5% |
| 3 | 1 | 0.6% |
| Distinct | 10 |
|---|---|
| Distinct (%) | 5.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.379888268 |
| Minimum | 0 |
| Maximum | 9 |
| Zeros | 76 |
| Zeros (%) | 42.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 5 |
| Maximum | 9 |
| Range | 9 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.796012434 |
|---|---|
| Coefficient of variation (CV) | 1.301563666 |
| Kurtosis | 3.785793805 |
| Mean | 1.379888268 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.827837447 |
| Sum | 247 |
| Variance | 3.225660662 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 76 | 42.5% |
| 1 | 42 | 23.5% |
| 2 | 27 | 15.1% |
| 4 | 13 | 7.3% |
| 3 | 11 | 6.1% |
| 5 | 4 | 2.2% |
| 8 | 3 | 1.7% |
| 7 | 1 | 0.6% |
| 9 | 1 | 0.6% |
| 6 | 1 | 0.6% |
| Value | Count | Frequency (%) |
| 0 | 76 | 42.5% |
| 1 | 42 | 23.5% |
| 2 | 27 | 15.1% |
| 3 | 11 | 6.1% |
| 4 | 13 | 7.3% |
| 5 | 4 | 2.2% |
| 6 | 1 | 0.6% |
| 7 | 1 | 0.6% |
| 8 | 3 | 1.7% |
| 9 | 1 | 0.6% |
| Value | Count | Frequency (%) |
| 9 | 1 | 0.6% |
| 8 | 3 | 1.7% |
| 7 | 1 | 0.6% |
| 6 | 1 | 0.6% |
| 5 | 4 | 2.2% |
| 4 | 13 | 7.3% |
| 3 | 11 | 6.1% |
| 2 | 27 | 15.1% |
| 1 | 42 | 23.5% |
| 0 | 76 | 42.5% |
| Distinct | 5 |
|---|---|
| Distinct (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| Value | Count |
| 0 | 156 |
| 1 | 17 |
| 2 | 4 |
| 4 | 1 |
| 3 | 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 179 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | 1.1% |
Sample
| 1st row | 2 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 156 | 87.2% |
| 1 | 17 | 9.5% |
| 2 | 4 | 2.2% |
| 4 | 1 | 0.6% |
| 3 | 1 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 156 | |
| 1 | 17 | 9.5% |
| 2 | 4 | 2.2% |
| 4 | 1 | 0.6% |
| 3 | 1 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 179 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 156 | |
| 1 | 17 | 9.5% |
| 2 | 4 | 2.2% |
| 4 | 1 | 0.6% |
| 3 | 1 | 0.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 179 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 156 | |
| 1 | 17 | 9.5% |
| 2 | 4 | 2.2% |
| 4 | 1 | 0.6% |
| 3 | 1 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 179 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 156 | |
| 1 | 17 | 9.5% |
| 2 | 4 | 2.2% |
| 4 | 1 | 0.6% |
| 3 | 1 | 0.6% |
| Distinct | 3 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| Value | Count |
| 0 | 172 |
| 1 | 6 |
| 2 | 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 179 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.6% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 172 | 96.1% |
| 1 | 6 | 3.4% |
| 2 | 1 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 172 | |
| 1 | 6 | 3.4% |
| 2 | 1 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 179 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 172 | |
| 1 | 6 | 3.4% |
| 2 | 1 | 0.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 179 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 172 | |
| 1 | 6 | 3.4% |
| 2 | 1 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 179 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 172 | |
| 1 | 6 | 3.4% |
| 2 | 1 | 0.6% |
| Distinct | 2 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| Value | Count |
| 0 | 173 |
| 1 | 6 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 179 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 173 | 96.6% |
| 1 | 6 | 3.4% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 173 | 96.6% |
| 1 | 6 | 3.4% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 179 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 173 | 96.6% |
| 1 | 6 | 3.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 179 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 173 | 96.6% |
| 1 | 6 | 3.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 179 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 173 | 96.6% |
| 1 | 6 | 3.4% |
| Distinct | 3 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.3 KiB |
| 0 | 175 |
|---|---|
| 1 | 3 |
| 2 | 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 179 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.6% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 175 | 97.8% |
| 1 | 3 | 1.7% |
| 2 | 1 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 175 | 97.8% |
| 1 | 3 | 1.7% |
| 2 | 1 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 179 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 175 | 97.8% |
| 1 | 3 | 1.7% |
| 2 | 1 | 0.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 179 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 175 | 97.8% |
| 1 | 3 | 1.7% |
| 2 | 1 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 179 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 175 | 97.8% |
| 1 | 3 | 1.7% |
| 2 | 1 | 0.6% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, so it captures nonlinear monotonic relationships better than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
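The calculation described above can be sketched in plain Python. This is an illustrative version under stated assumptions, not the code the report generator runs: the helper names `average_ranks` and `spearman_rho` are hypothetical, and tied values are assumed to share the mean of their rank positions.

```python
from math import sqrt


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined when a variable is constant")
    return cov / (sd_x * sd_y)


# A cubic relationship is nonlinear but monotonic, so rho is (numerically) 1.
x = [1, 2, 3, 4, 5]
y = [value ** 3 for value in x]
print(spearman_rho(x, y))
```

Because only ranks enter the formula, any strictly increasing transformation of X or Y leaves ρ unchanged.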
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that, for a linear function with nonzero slope, the angle to the x-axis does not affect the magnitude of r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
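A minimal sketch of this formula in plain Python follows. The function name `pearson_r` is hypothetical, and the sample values are the `word_count` and `char_count` entries of the nine rows listed under "First rows" in this report; the snippet is an illustration, not the profiler's own implementation.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant column has zero variance, so r cannot be computed for it.
        raise ValueError("r is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# word_count and char_count from the nine sample rows of this report.
word_count = [354, 72, 235, 148, 73, 92, 40, 450, 274]
char_count = [2118, 383, 1316, 1000, 488, 577, 249, 2552, 1516]
print(pearson_r(word_count, char_count))

# Location and (positive) scale changes leave r unchanged.
shifted = [10 * value + 3 for value in word_count]
print(pearson_r(shifted, char_count))
```

The zero-variance guard reflects why columns flagged as constant in the overview cannot take part in a correlation.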
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
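The pair-counting definition above can be sketched as follows. As worded, the formula corresponds to the variant usually called τ-a, in which tied pairs count as neither concordant nor discordant; tie-adjusted variants such as τ-b also exist, and which one a given tool reports is not stated here. The function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables order this pair the same way
        elif direction < 0:
            discordant += 1  # the variables order this pair in opposite ways
        # direction == 0 means a tie in x or y; it adds to neither count
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))  # 2 concordant, 1 discordant, 3 pairs
```

This direct version compares every pair, so its cost grows quadratically with the number of observations; that is fine for a dataset of 179 rows.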
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | df_index | Name | Description | Type | lang | Description_clean | word_count | char_count | sentence_count | avg_word_length | avg_sentence_length | name_word_count | name_char_count | name_avg_word_length | Polarity | parsed | entity_tags | entity_types | CARDINAL | DATE | EVENT | FAC | GPE | LANGUAGE | LAW | LOC | MONEY | NORP | ORDINAL | ORG | PERCENT | PERSON | PRODUCT | QUANTITY | TIME | WORK_OF_ART |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Acinetobacter baumannii | Acinetobacter baumannii is a typically short, almost round, rod-shaped (coccobacillus) Gram-negative bacterium. It is named after the bacteriologist Paul Baumann. It can be an opportunistic pathogen in humans, affecting people with compromised immune systems, and is becoming increasingly important as a hospital-derived (nosocomial) infection. While other species of the genus Acinetobacter are often found in soil samples (leading to the common misconception that A. baumannii is a soil organism, too), it is almost exclusively isolated from hospital environments. Although occasionally it has been found in environmental soil and water samples, its natural habitat is still not known.\nBacteria of this genus lack flagella, whip-like structures many bacteria use for locomotion, but exhibit twitching or swarming motility. This may be due to the activity of type IV pili, pole-like structures that can be extended and retracted. Motility in A. baumannii may also be due to the excretion of exopolysaccharide, creating a film of high-molecular-weight sugar chains behind the bacterium to move forward. Clinical microbiologists typically differentiate members of the genus Acinetobacter from other Moraxellaceae by performing an oxidase test, as Acinetobacter spp. are the only members of the Moraxellaceae to lack cytochrome c oxidases.A. baumannii is part of the ACB complex (A. baumannii, A. calcoaceticus, and Acinetobacter genomic species 13TU). It is difficult to determine the specific species of members of the ACB complex and they comprise the most clinically relevant members of the genus. A. baumannii has also been identified as an ESKAPE pathogen (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species), a group of pathogens with a high rate of antibiotic resistance that are responsible for the majority of nosocomial infections.Colloquially, A. 
baumannii is referred to as "Iraqibacter" due to its seemingly sudden emergence in military treatment facilities during the Iraq War. It has continued to be an issue for veterans and soldiers who served in Iraq and Afghanistan. Multidrug-resistant A. baumannii has spread to civilian hospitals in part due to the transport of infected soldiers through multiple medical facilities. During the COVID-19 pandemic, coinfection with A. baumannii secondary to SARS-CoV-2 infections has been reported multiple times in literature. | Bacteria | en | acinetobacter baumannii typically short almost round rodshaped coccobacillus gramnegative bacterium named bacteriologist paul baumann opportunistic pathogen human affecting people compromised immune system becoming increasingly important hospitalderived nosocomial infection specie genus acinetobacter often found soil sample leading common misconception baumannii soil organism almost exclusively isolated hospital environment although occasionally found environmental soil water sample natural habitat still known bacteria genus lack flagellum whiplike structure many bacteria use locomotion exhibit twitching swarming motility may due activity type iv pili polelike structure extended retracted motility baumannii may also due excretion exopolysaccharide creating film highmolecularweight sugar chain behind bacterium move forward clinical microbiologist typically differentiate member genus acinetobacter moraxellaceae performing oxidase test acinetobacter spp member moraxellaceae lack cytochrome c oxidasesa baumannii part acb complex baumannii calcoaceticus acinetobacter genomic specie 13tu difficult determine specific specie member acb complex comprise clinically relevant member genus baumannii also identified eskape pathogen enterococcus faecium staphylococcus aureus klebsiella pneumoniae acinetobacter baumannii pseudomonas aeruginosa enterobacter specie group pathogen high rate antibiotic resistance responsible majority nosocomial 
infectionscolloquially baumannii referred iraqibacter due seemingly sudden emergence military treatment facility iraq war continued issue veteran soldier served iraq afghanistan multidrugresistant baumannii spread civilian hospital part due transport infected soldier multiple medical facility covid19 pandemic coinfection baumannii secondary sarscov2 infection reported multiple time literature | 354 | 2118 | 27 | 5.983051 | 13.111111 | 2 | 22 | 11.0 | -0.019570 | (acinetobacter, baumannii, typically, short, almost, round, rodshaped, coccobacillus, gramnegative, bacterium, named, bacteriologist, paul, baumann, opportunistic, pathogen, human, affecting, people, compromised, immune, system, becoming, increasingly, important, hospitalderived, nosocomial, infection, specie, genus, acinetobacter, often, found, soil, sample, leading, common, misconception, baumannii, soil, organism, almost, exclusively, isolated, hospital, environment, although, occasionally, found, environmental, soil, water, sample, natural, habitat, still, known, bacteria, genus, lack, flagellum, whiplike, structure, many, bacteria, use, locomotion, exhibit, twitching, swarming, motility, may, due, activity, type, iv, pili, polelike, structure, extended, retracted, motility, baumannii, may, also, due, excretion, exopolysaccharide, creating, film, highmolecularweight, sugar, chain, behind, bacterium, move, forward, clinical, microbiologist, typically, ...) | [(Paul Baumann, PERSON), (Acinetobacter, LOC), (IV pili, PERSON), (Moraxellaceae, PERSON), (ACB, ORG), (Acinetobacter, PRODUCT), (13TU, CARDINAL), (ACB, ORG), (ESKAPE, ORG), (Acinetobacter, PRODUCT), (Enterobacter, PERSON), (the Iraq War, EVENT), (Iraq, GPE), (Afghanistan, GPE)] | [1, 0, 1, 0, 2, 0, 0, 1, 0, 0, 0, 3, 0, 4, 2, 0, 0, 0] | 1 | 0 | 1 | 0 | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 3 | 0 | 4 | 2 | 0 | 0 | 0 |
| 1 | 1 | Actinomyces israelii | Actinomyces israelii is a species of Gram-positive, rod-shaped bacteria within the genus Actinomyces. Known to live commensally on and within humans, A. israelii is an opportunistic pathogen and a cause of actinomycosis. Many physiologically diverse strains of the species are known to exist, though not all are strict anaerobes. It was named after the German surgeon James Adolf Israel (1848–1926), who studied the organism for the first time in 1878. | Bacteria | en | actinomyces israelii specie grampositive rodshaped bacteria within genus actinomyces known live commensally within human israelii opportunistic pathogen cause actinomycosis many physiologically diverse strain specie known exist though strict anaerobe named german surgeon james adolf israel 18481926 studied organism first time 1878 | 72 | 383 | 6 | 5.319444 | 12.000000 | 2 | 19 | 9.5 | 0.221591 | (actinomyces, israelii, specie, grampositive, rodshaped, bacteria, within, genus, actinomyces, known, live, commensally, within, human, israelii, opportunistic, pathogen, cause, actinomycosis, many, physiologically, diverse, strain, specie, known, exist, though, strict, anaerobe, named, german, surgeon, james, adolf, israel, 18481926, studied, organism, first, time, 1878) | [(Actinomyces, ORG), (Actinomyces, ORG), (German, NORP), (James Adolf, PERSON), (Israel, GPE), (1848–1926, CARDINAL), (first, ORDINAL), (1878, DATE)] | [1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 2, 0, 1, 0, 0, 0, 0] | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 2 | 0 | 1 | 0 | 0 | 0 | 0 |
| 2 | 2 | Agrobacterium tumefaciens | Agrobacterium radiobacter (more commonly known as Agrobacterium tumefaciens) is the causal agent of crown gall disease (the formation of tumours) in over 140 species of eudicots. It is a rod-shaped, Gram-negative soil bacterium. Symptoms are caused by the insertion of a small segment of DNA (known as the T-DNA, for 'transfer DNA', not to be confused with tRNA that transfers amino acids during protein synthesis), from a plasmid into the plant cell, which is incorporated at a semi-random location into the plant genome. Plant genomes can be engineered by use of Agrobacterium for the delivery of sequences hosted in T-DNA binary vectors.\nAgrobacterium tumefaciens is an alphaproteobacterium of the family Rhizobiaceae, which includes the nitrogen-fixing legume symbionts. Unlike the nitrogen-fixing symbionts, tumor-producing Agrobacterium species are pathogenic and do not benefit the plant. The wide variety of plants affected by Agrobacterium makes it of great concern to the agriculture industry.Economically, A. tumefaciens is a serious pathogen of walnuts, grape vines, stone fruits, nut trees, sugar beets, horse radish, and rhubarb, and the persistent nature of the tumors or galls caused by the disease make it particularly harmful for perennial crops.Agrobacterium tumefaciens grows optimally at 28 °C. The doubling time can range from 2.5–4h depending on the media, culture format, and level of aeration. At temperatures above 30 °C, A. tumefaciens begins to experience heat shock which is likely to result in errors in cell division. 
| Bacteria | en | agrobacterium radiobacter commonly known agrobacterium tumefaciens causal agent crown gall disease formation tumour 140 specie eudicots rodshaped gramnegative soil bacterium symptom caused insertion small segment dna known tdna transfer dna confused trna transfer amino acid protein synthesis plasmid plant cell incorporated semirandom location plant genome plant genome engineered use agrobacterium delivery sequence hosted tdna binary vector agrobacterium tumefaciens alphaproteobacterium family rhizobiaceae includes nitrogenfixing legume symbionts unlike nitrogenfixing symbionts tumorproducing agrobacterium specie pathogenic benefit plant wide variety plant affected agrobacterium make great concern agriculture industryeconomically tumefaciens serious pathogen walnut grape vine stone fruit nut tree sugar beet horse radish rhubarb persistent nature tumor gall caused disease make particularly harmful perennial cropsagrobacterium tumefaciens grows optimally 28 c doubling time range 254h depending medium culture format level aeration temperature 30 c tumefaciens begin experience heat shock likely result error cell division | 235 | 1316 | 15 | 5.600000 | 15.666667 | 2 | 24 | 12.0 | 0.008333 | (agrobacterium, radiobacter, commonly, known, agrobacterium, tumefaciens, causal, agent, crown, gall, disease, formation, tumour, 140, specie, eudicots, rodshaped, gramnegative, soil, bacterium, symptom, caused, insertion, small, segment, dna, known, tdna, transfer, dna, confused, trna, transfer, amino, acid, protein, synthesis, plasmid, plant, cell, incorporated, semirandom, location, plant, genome, plant, genome, engineered, use, agrobacterium, delivery, sequence, hosted, tdna, binary, vector, agrobacterium, tumefaciens, alphaproteobacterium, family, rhizobiaceae, includes, nitrogenfixing, legume, symbionts, unlike, nitrogenfixing, symbionts, tumorproducing, agrobacterium, specie, pathogenic, benefit, plant, wide, variety, plant, affected, agrobacterium, make, 
great, concern, agriculture, industryeconomically, tumefaciens, serious, pathogen, walnut, grape, vine, stone, fruit, nut, tree, sugar, beet, horse, radish, rhubarb, persistent, ...) | [(over 140, CARDINAL), (Rhizobiaceae, PERSON), (28 °C, CARDINAL), (2.5–4h, CARDINAL)] | [3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0] | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| 3 | 3 | Anaplasma | Anaplasma is a genus of bacteria of the alphaproteobacterial order Rickettsiales, family Anaplasmataceae.\nAnaplasma species reside in host blood cells and lead to the disease anaplasmosis. The disease most commonly occurs in areas where competent tick vectors are indigenous, including tropical and semitropical areas of the world for intraerythrocytic Anaplasma spp.Anaplasma species are biologically transmitted by Ixodes deer-tick vectors, and the prototypical species, A. marginale, can be mechanically transmitted by biting flies and iatrogenically with blood-contaminated instruments. One of the major consequences of infection by bovine red blood cells by A. marginale is the development of nonhaemolytic anaemia, thus the absence of hemoglobinuria, which allows clinical differentiation from another major tick-borne disease, bovine babesiosis, caused by Babesia bigemina.Species of veterinary interest include:\n\nAnaplasma marginale and Anaplasma centrale in cattle\nAnaplasma ovis and Anaplasma mesaeterum in sheep and goats\nAnaplasma phagocytophilum in dogs, cats, and horses (see human granulocytic anaplasmosis)\nAnaplasma platys in dogs | Bacteria | en | anaplasma genus bacteria alphaproteobacterial order rickettsiales family anaplasmataceae anaplasma specie reside host blood cell lead disease anaplasmosis disease commonly occurs area competent tick vector indigenous including tropical semitropical area world intraerythrocytic anaplasma sppanaplasma specie biologically transmitted ixodes deertick vector prototypical specie marginale mechanically transmitted biting fly iatrogenically bloodcontaminated instrument one major consequence infection bovine red blood cell marginale development nonhaemolytic anaemia thus absence hemoglobinuria allows clinical differentiation another major tickborne disease bovine babesiosis caused babesia bigeminaspecies veterinary interest include anaplasma marginale anaplasma centrale cattle anaplasma ovis anaplasma 
mesaeterum sheep goat anaplasma phagocytophilum dog cat horse see human granulocytic anaplasmosis anaplasma platy dog | 148 | 1000 | 8 | 6.756757 | 18.500000 | 1 | 9 | 9.0 | 0.101562 | (anaplasma, genus, bacteria, alphaproteobacterial, order, rickettsiales, family, anaplasmataceae, anaplasma, specie, reside, host, blood, cell, lead, disease, anaplasmosis, disease, commonly, occurs, area, competent, tick, vector, indigenous, including, tropical, semitropical, area, world, intraerythrocytic, anaplasma, sppanaplasma, specie, biologically, transmitted, ixodes, deertick, vector, prototypical, specie, marginale, mechanically, transmitted, biting, fly, iatrogenically, bloodcontaminated, instrument, one, major, consequence, infection, bovine, red, blood, cell, marginale, development, nonhaemolytic, anaemia, thus, absence, hemoglobinuria, allows, clinical, differentiation, another, major, tickborne, disease, bovine, babesiosis, caused, babesia, bigeminaspecies, veterinary, interest, include, anaplasma, marginale, anaplasma, centrale, cattle, anaplasma, ovis, anaplasma, mesaeterum, sheep, goat, anaplasma, phagocytophilum, dog, cat, horse, see, human, granulocytic, anaplasmosis, anaplasma, ...) | [(Anaplasma, PERSON), (Anaplasmataceae, ORG), (Anaplasma, PERSON), (Anaplasma, PERSON), (Anaplasma, PERSON), (Ixodes, ORG), (One, CARDINAL), (Anaplasma, PERSON), (Anaplasma, GPE), (Anaplasma ovis, PERSON), (Anaplasma, PERSON), (Anaplasma, PERSON)] | [1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 2, 0, 8, 0, 0, 0, 0] | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 8 | 0 | 0 | 0 | 0 |
| 4 | 4 | Anaplasma phagocytophilum | Anaplasma phagocytophilum (formerly Ehrlichia phagocytophilum) is a Gram-negative bacterium that is unusual in its tropism to neutrophils. It causes anaplasmosis in sheep and cattle, also known as tick-borne fever and pasture fever, and also causes the zoonotic disease human granulocytic anaplasmosis.A. phagocytophilum is a Gram-negative, obligate bacterium of neutrophils. It causes human granulocytic anaplasmosis, which is a tick-borne rickettsial disease. Because this bacterium invades neutrophils, it has a unique adaptation and pathogenetic mechanism. | Bacteria | en | anaplasma phagocytophilum formerly ehrlichia phagocytophilum gramnegative bacterium unusual tropism neutrophil cause anaplasmosis sheep cattle also known tickborne fever pasture fever also cause zoonotic disease human granulocytic anaplasmosisa phagocytophilum gramnegative obligate bacterium neutrophil cause human granulocytic anaplasmosis tickborne rickettsial disease bacterium invades neutrophil unique adaptation pathogenetic mechanism | 73 | 488 | 7 | 6.684932 | 10.428571 | 2 | 24 | 12.0 | 0.115000 | (anaplasma, phagocytophilum, formerly, ehrlichia, phagocytophilum, gramnegative, bacterium, unusual, tropism, neutrophil, cause, anaplasmosis, sheep, cattle, also, known, tickborne, fever, pasture, fever, also, cause, zoonotic, disease, human, granulocytic, anaplasmosisa, phagocytophilum, gramnegative, obligate, bacterium, neutrophil, cause, human, granulocytic, anaplasmosis, tickborne, rickettsial, disease, bacterium, invades, neutrophil, unique, adaptation, pathogenetic, mechanism) | [(Anaplasma, PERSON), (Ehrlichia, ORG), (A. phagocytophilum, PERSON)] | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 2, 0, 0, 0, 0] | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 0 |
| 5 | 5 | Azorhizobium caulinodans | Azorhizobium caulinodans is a species of bacteria that forms a nitrogen-fixing symbiosis with plants of the genus Sesbania. The symbiotic relationship between Sesbania rostrata and A. caulinodans lead to nitrogen fixing nodules in S. rostrata. Bacterial chemotaxis plays an important role in establishing this symbiotic relationship.Azorhizobium caulinodans is a genome and it contains chemotaxis gene clusters that are unique. It has five chemotaxis genes which are: cheW(1), cheW, cheA, cheR, and cheB. Azorhizobium caulinodans controls the movements of flagella, and the chemotaxis signaling path in Azorhizobium caulinodans helps with regulating biofilm formation. | Bacteria | en | azorhizobium caulinodans specie bacteria form nitrogenfixing symbiosis plant genus sesbania symbiotic relationship sesbania rostrata caulinodans lead nitrogen fixing nodule rostrata bacterial chemotaxis play important role establishing symbiotic relationshipazorhizobium caulinodans genome contains chemotaxis gene cluster unique five chemotaxis gene chew1 chew chea cher cheb azorhizobium caulinodans control movement flagellum chemotaxis signaling path azorhizobium caulinodans help regulating biofilm formation | 92 | 577 | 9 | 6.271739 | 10.222222 | 2 | 23 | 11.5 | 0.387500 | (azorhizobium, caulinodans, specie, bacteria, form, nitrogenfixing, symbiosis, plant, genus, sesbania, symbiotic, relationship, sesbania, rostrata, caulinodans, lead, nitrogen, fixing, nodule, rostrata, bacterial, chemotaxis, play, important, role, establishing, symbiotic, relationshipazorhizobium, caulinodans, genome, contains, chemotaxis, gene, cluster, unique, five, chemotaxis, gene, chew1, chew, chea, cher, cheb, azorhizobium, caulinodans, control, movement, flagellum, chemotaxis, signaling, path, azorhizobium, caulinodans, help, regulating, biofilm, formation) | [(Azorhizobium, ORG), (Sesbania, GPE), (Sesbania, GPE), (five, CARDINAL), (cheW, cheA, ORG), (cheR, ORG), (Azorhizobium, 
ORG), (Azorhizobium, ORG)] | [1, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0] | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 6 | Azotobacter vinelandii | Azotobacter vinelandii is Gram-negative diazotroph that can fix nitrogen while grown aerobically. These bacteria are easily cultured and grown.\nA. vinelandii is a free-living N2 fixer known to produce many phytohormones and vitamins in soils. It produces fluorescent pyoverdine pigments. | Bacteria | en | azotobacter vinelandii gramnegative diazotroph fix nitrogen grown aerobically bacteria easily cultured grown vinelandii freeliving n2 fixer known produce many phytohormone vitamin soil produce fluorescent pyoverdine pigment | 40 | 249 | 6 | 6.225000 | 6.666667 | 2 | 21 | 10.5 | 0.466667 | (azotobacter, vinelandii, gramnegative, diazotroph, fix, nitrogen, grown, aerobically, bacteria, easily, cultured, grown, vinelandii, freeliving, n2, fixer, known, produce, many, phytohormone, vitamin, soil, produce, fluorescent, pyoverdine, pigment) | [(N2, CARDINAL)] | [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 7 | Bacillus | Bacillus (Latin "stick") is a genus of Gram-positive, rod-shaped bacteria, a member of the phylum Bacillota, with 266 named species. The term is also used to describe the shape (rod) of certain bacteria; and the plural Bacilli is the name of the class of bacteria to which this genus belongs. Bacillus species can be either obligate aerobes: oxygen dependent; or facultative anaerobes: having the ability to continue living in the absence of oxygen. Cultured Bacillus species test positive for the enzyme catalase if oxygen has been used or is present.Bacillus can reduce themselves to oval endospores and can remain in this dormant state for years. The endospore of one species from Morocco is reported to have survived being heated to 420 °C. Endospore formation is usually triggered by a lack of nutrients: the bacterium divides within its cell wall, and one side then engulfs the other. They are not true spores (i.e., not an offspring). Endospore formation originally defined the genus, but not all such species are closely related, and many species have been moved to other genera of the Bacillota. Only one endospore is formed per cell. The spores are resistant to heat, cold, radiation, desiccation, and disinfectants. Bacillus anthracis needs oxygen to sporulate; this constraint has important consequences for epidemiology and control. In vivo, B. anthracis produces a polypeptide (polyglutamic acid) capsule that kills it from phagocytosis. The genera Bacillus and Clostridium constitute the family Bacillaceae. Species are identified by using morphologic and biochemical criteria. Because the spores of many Bacillus species are resistant to heat, radiation, disinfectants, and desiccation, they are difficult to eliminate from medical and pharmaceutical materials and are a frequent cause of contamination. Not only are they resistant to heat, radiation, etc., but they are also resistant to chemicals such as antibiotics. 
This resistance allows them to survive for many years and especially in a controlled environment. Bacillus species are well known in the food industries as troublesome spoilage organisms.Ubiquitous in nature, Bacillus includes symbiotic(sometimes referred to as endophytes) as well as independent species. Two parasitic pathogenic species are medically significant: B. anthracis causes anthrax; and B. cereus causes food poisoning.\nMany species of Bacillus can produce copious amounts of enzymes, which are used in various industries, such as in the production of alpha amylase used in starch hydrolysis and the protease subtilisin used in detergents. B. subtilis is a valuable model for bacterial research. \nSome Bacillus species can synthesize and secrete lipopeptides, in particular surfactins and mycosubtilins. Bacillus species are also found in marine sponges. Marine sponge associated Bacillus subtilis (strains WS1A and YBS29) can synthesize several antimicrobial peptides. These Bacillus subtilis strains can develop disease resistance in Labeo rohita. 
| Bacteria | en | bacillus latin stick genus grampositive rodshaped bacteria member phylum bacillota 266 named specie term also used describe shape rod certain bacteria plural bacillus name class bacteria genus belongs bacillus specie either obligate aerobe oxygen dependent facultative anaerobe ability continue living absence oxygen cultured bacillus specie test positive enzyme catalase oxygen used presentbacillus reduce oval endospore remain dormant state year endospore one specie morocco reported survived heated 420 c endospore formation usually triggered lack nutrient bacterium divide within cell wall one side engulfs true spore ie offspring endospore formation originally defined genus specie closely related many specie moved genus bacillota one endospore formed per cell spore resistant heat cold radiation desiccation disinfectant bacillus anthracis need oxygen sporulate constraint important consequence epidemiology control vivo b anthracis produce polypeptide polyglutamic acid capsule kill phagocytosis genus bacillus clostridium constitute family bacillaceae specie identified using morphologic biochemical criterion spore many bacillus specie resistant heat radiation disinfectant desiccation difficult eliminate medical pharmaceutical material frequent cause contamination resistant heat radiation etc also resistant chemical antibiotic resistance allows survive many year especially controlled environment bacillus specie well known food industry troublesome spoilage organismsubiquitous nature bacillus includes symbioticsometimes referred endophytes well independent specie two parasitic pathogenic specie medically significant b anthracis cause anthrax b cereus cause food poisoning many specie bacillus produce copious amount enzyme used various industry production alpha amylase used starch hydrolysis protease subtilisin used detergent b subtilis valuable model bacterial research bacillus specie synthesize secrete lipopeptides particular surfactins mycosubtilins 
bacillus specie also found marine sponge marine sponge associated bacillus subtilis strain ws1a ybs29 synthesize several antimicrobial peptide bacillus subtilis strain develop disease resistance labeo rohita | 450 | 2552 | 35 | 5.671111 | 12.857143 | 1 | 8 | 8.0 | 0.071404 | (bacillus, latin, stick, genus, grampositive, rodshaped, bacteria, member, phylum, bacillota, 266, named, specie, term, also, used, describe, shape, rod, certain, bacteria, plural, bacillus, name, class, bacteria, genus, belongs, bacillus, specie, either, obligate, aerobe, oxygen, dependent, facultative, anaerobe, ability, continue, living, absence, oxygen, cultured, bacillus, specie, test, positive, enzyme, catalase, oxygen, used, presentbacillus, reduce, oval, endospore, remain, dormant, state, year, endospore, one, specie, morocco, reported, survived, heated, 420, c, endospore, formation, usually, triggered, lack, nutrient, bacterium, divide, within, cell, wall, one, side, engulfs, true, spore, ie, offspring, endospore, formation, originally, defined, genus, specie, closely, related, many, specie, moved, genus, bacillota, one, ...) | [(Latin, NORP), (Bacillota, ORG), (266, CARDINAL), (Bacilli, PERSON), (years, DATE), (one, CARDINAL), (Morocco, GPE), (420 °, CARDINAL), (Bacillota, LOC), (Only one, CARDINAL), (vivo, GPE), (many years, DATE), (Two, CARDINAL), (Labeo, GPE)] | [5, 2, 0, 0, 3, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0] | 5 | 2 | 0 | 0 | 3 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 8 | 8 | Bacillus anthracis | Bacillus anthracis is a Gram-positive and rod-shaped bacterium that causes anthrax, a deadly disease to livestock and, occasionally, to humans. It is the only permanent (obligate) pathogen within the genus Bacillus. Its infection is a type of zoonosis, as it is transmitted from animals to humans. It was discovered by a German physician Robert Koch in 1876, and became the first bacterium to be experimentally shown as a pathogen. The discovery was also the first scientific evidence for the germ theory of diseases.B. anthracis measures about 3 to 5 μm long and 1 to 1.2 μm wide. The reference genome consists of a 5,227,419 bp circular chromosome and two extrachromosomal DNA plasmids, pXO1 and pXO2, of 181,677 and 94,830 bp respectively, which are responsible for the pathogenicity. It forms a protective layer called endospore by which it can remain inactive for many years and suddenly becomes infective under suitable environmental conditions. Because of the resilience of the endospore, the bacterium is one of the most popular biological weapons. The protein capsule (poly-D-gamma-glutamic acid) is key to evasion of the immune response. It feeds on the heme of blood protein haemoglobin using two secretory siderophore proteins, IsdX1 and IsdX2.\n\nUntreated B. anthracis infection is usually deadly. Infection is indicated by inflammatory, black, necrotic lesion (eschar). The sores usually appear on the face, neck, arms, or hands. The fatal symptoms include flu-like fever, chest discomfort, diaphoresis, and body aches. The first animal vaccine against anthrax was developed by French chemist Louis Pasteur in 1881. Different animal and human vaccines are now available. The infection can be treated with common antibiotics such as penicillins, quinolones, and tetracyclines. | Bacteria | en | bacillus anthracis grampositive rodshaped bacterium cause anthrax deadly disease livestock occasionally human permanent obligate pathogen within genus bacillus infection type zoonosis transmitted animal human discovered german physician robert koch 1876 became first bacterium experimentally shown pathogen discovery also first scientific evidence germ theory diseasesb anthracis measure 3 5 μm long 1 12 μm wide reference genome consists 5227419 bp circular chromosome two extrachromosomal dna plasmid pxo1 pxo2 181677 94830 bp respectively responsible pathogenicity form protective layer called endospore remain inactive many year suddenly becomes infective suitable environmental condition resilience endospore bacterium one popular biological weapon protein capsule polydgammaglutamic acid key evasion immune response feed heme blood protein haemoglobin using two secretory siderophore protein isdx1 isdx2 untreated b anthracis infection usually deadly infection indicated inflammatory black necrotic lesion eschar sore usually appear face neck arm hand fatal symptom include flulike fever chest discomfort diaphoresis body ache first animal vaccine anthrax developed french chemist louis pasteur 1881 different animal human vaccine available infection treated common antibiotic penicillin quinolones tetracycline | 274 | 1516 | 22 | 5.532847 | 12.454545 | 2 | 17 | 8.5 | 0.086905 | (bacillus, anthracis, grampositive, rodshaped, bacterium, cause, anthrax, deadly, disease, livestock, occasionally, human, permanent, obligate, pathogen, within, genus, bacillus, infection, type, zoonosis, transmitted, animal, human, discovered, german, physician, robert, koch, 1876, became, first, bacterium, experimentally, shown, pathogen, discovery, also, first, scientific, evidence, germ, theory, diseasesb, anthracis, measure, 3, 5, μm, long, 1, 12, μm, wide, reference, genome, consists, 5227419, bp, circular, chromosome, two, extrachromosomal, dna, plasmid, pxo1, pxo2, 181677, 94830, bp, respectively, responsible, pathogenicity, form, protective, layer, called, endospore, remain, inactive, many, year, suddenly, becomes, infective, suitable, environmental, condition, resilience, endospore, bacterium, one, popular, biological, weapon, protein, capsule, polydgammaglutamic, acid, key, ...) | [(German, NORP), (Robert Koch, PERSON), (1876, DATE), (first, ORDINAL), (first, ORDINAL), (about 3, CARDINAL), (1, CARDINAL), (1.2 μm, QUANTITY), (5,227,419, CARDINAL), (two, CARDINAL), (181,677, CARDINAL), (94,830, CARDINAL), (many years, DATE), (two, CARDINAL), (IsdX1, ORG), (IsdX2, ORG), (first, ORDINAL), (French, NORP), (Louis Pasteur, PERSON), (1881, DATE)] | [7, 3, 0, 0, 0, 0, 0, 0, 0, 2, 3, 2, 0, 2, 0, 1, 0, 0] | 7 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 3 | 2 | 0 | 2 | 0 | 1 | 0 | 0 |
| 9 | 9 | Bacillus cereus | Bacillus cereus is a Gram-positive, rod-shaped, facultatively anaerobic, motile, beta-hemolytic, spore-forming bacterium commonly found in soil, food and marine sponges. The specific name, cereus, meaning "waxy" in Latin, refers to the appearance of colonies grown on blood agar. Some strains are harmful to humans and cause foodborne illness, while other strains can be beneficial as probiotics for animals. The bacteria is classically contracted from fried rice dishes that have been sitting at room temperature for hours. B. cereus bacteria are facultative anaerobes, and like other members of the genus Bacillus, can produce protective endospores. Its virulence factors include phospholipase C, cereulide, sphingomyelinase, metalloproteases, and cytotoxin K.The Bacillus cereus group comprises seven closely related species: B. cereus sensu stricto (referred to herein as B. cereus), B. anthracis, B. thuringiensis, B. mycoides, B. pseudomycoides, and B. cytotoxicus; or as six species in a Bacillus cereus sensu lato: B. weihenstephanensis, B. mycoides, B. pseudomycoides, B. cereus, B. thuringiensis, and B. anthracis. | Bacteria | en | bacillus cereus grampositive rodshaped facultatively anaerobic motile betahemolytic sporeforming bacterium commonly found soil food marine sponge specific name cereus meaning waxy latin refers appearance colony grown blood agar strain harmful human cause foodborne illness strain beneficial probiotic animal bacteria classically contracted fried rice dish sitting room temperature hour b cereus bacteria facultative anaerobe like member genus bacillus produce protective endospore virulence factor include phospholipase c cereulide sphingomyelinase metalloproteases cytotoxin kthe bacillus cereus group comprises seven closely related specie b cereus sensu stricto referred herein b cereus b anthracis b thuringiensis b mycoides b pseudomycoides b cytotoxicus six specie bacillus cereus sensu lato b weihenstephanensis b mycoides b pseudomycoides b cereus b thuringiensis b anthracis | 159 | 967 | 22 | 6.081761 | 7.227273 | 2 | 14 | 7.0 | -0.091667 | (bacillus, cereus, grampositive, rodshaped, facultatively, anaerobic, motile, betahemolytic, sporeforming, bacterium, commonly, found, soil, food, marine, sponge, specific, name, cereus, meaning, waxy, latin, refers, appearance, colony, grown, blood, agar, strain, harmful, human, cause, foodborne, illness, strain, beneficial, probiotic, animal, bacteria, classically, contracted, fried, rice, dish, sitting, room, temperature, hour, b, cereus, bacteria, facultative, anaerobe, like, member, genus, bacillus, produce, protective, endospore, virulence, factor, include, phospholipase, c, cereulide, sphingomyelinase, metalloproteases, cytotoxin, kthe, bacillus, cereus, group, comprises, seven, closely, related, specie, b, cereus, sensu, stricto, referred, herein, b, cereus, b, anthracis, b, thuringiensis, b, mycoides, b, pseudomycoides, b, cytotoxicus, six, specie, bacillus, cereus, ...) | [(Latin, LANGUAGE), (hours, TIME), (cereulide, ORG), (sphingomyelinase, ORG), (seven, CARDINAL), (six, CARDINAL), (B. weihenstephanensis, PERSON)] | [2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 1, 0] | 2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 1 | 0 | 0 | 1 | 0 |
Last rows
| | df_index | Name | Description | Type | lang | Description_clean | word_count | char_count | sentence_count | avg_word_length | avg_sentence_length | name_word_count | name_char_count | name_avg_word_length | Polarity | parsed | entity_tags | entity_types | CARDINAL | DATE | EVENT | FAC | GPE | LANGUAGE | LAW | LOC | MONEY | NORP | ORDINAL | ORG | PERCENT | PERSON | PRODUCT | QUANTITY | TIME | WORK_OF_ART |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 169 | 44 | Staphylococcus virus G1 | Staphylococcus virus G1 is a virus of the family Herelleviridae, genus Kayvirus.As a member of the group I of the Baltimore classification, Staphylococcus virus G1 is a dsDNA virus. All the family Herelleviridae members share a nonenveloped morphology consisting of a head and a tail separated by a neck. Its genome is linear. The propagation of the virions includes the attaching to a host cell (a bacterium, as Staphylococcus virus G1 is a bacteriophage) and the injection of the double stranded DNA; the host transcribes and translates it to manufacture new particles. To replicate its genetic content requires host cell DNA polymerases and, hence, the process is highly dependent on the cell cycle.The Gp67 protein of G1 has been found to interact with its host's RNA polymerase though an interaction with a sigma factor.The phage contains a genome of 138,715 base pairs with a 30.4% of GC content and 214 predicted genes; this means that the 88.5% of the DNA is coding open reading frames, and therefore the gene density (the number of genes per kilobase) is 1.54.\n\n\n== References == | Bacteriophage | en | staphylococcus virus g1 virus family herelleviridae genus kayvirusas member group baltimore classification staphylococcus virus g1 dsdna virus family herelleviridae member share nonenveloped morphology consisting head tail separated neck genome linear propagation virion includes attaching host cell bacterium staphylococcus virus g1 bacteriophage injection double stranded dna host transcribes translates manufacture new particle replicate genetic content requires host cell dna polymerase hence process highly dependent cell cyclethe gp67 protein g1 found interact host rna polymerase though interaction sigma factorthe phage contains genome 138715 base pair 304 gc content 214 predicted gene mean 885 dna coding open reading frame therefore gene density number gene per kilobase 154 reference | 180 | 909 | 12 | 5.050000 | 15.000000 | 3 | 21 | 7.000000 | -0.100727 | (staphylococcus, virus, g1, virus, family, herelleviridae, genus, kayvirusas, member, group, baltimore, classification, staphylococcus, virus, g1, dsdna, virus, family, herelleviridae, member, share, nonenveloped, morphology, consisting, head, tail, separated, neck, genome, linear, propagation, virion, includes, attaching, host, cell, bacterium, staphylococcus, virus, g1, bacteriophage, injection, double, stranded, dna, host, transcribes, translates, manufacture, new, particle, replicate, genetic, content, requires, host, cell, dna, polymerase, hence, process, highly, dependent, cell, cyclethe, gp67, protein, g1, found, interact, host, rna, polymerase, though, interaction, sigma, factorthe, phage, contains, genome, 138715, base, pair, 304, gc, content, 214, predicted, gene, mean, 885, dna, coding, open, reading, frame, therefore, gene, density, number, ...) | [(Herelleviridae, PERSON), (Kayvirus, ORG), (Baltimore, GPE), (Herelleviridae, PERSON), (Gp67, PERSON), (G1, PRODUCT), (138,715, CARDINAL), (30.4%, PERCENT), (GC, ORG), (214, CARDINAL), (88.5%, PERCENT), (1.54, CARDINAL)] | [3, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 2, 2, 3, 1, 0, 0, 0] | 3 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 2 | 3 | 1 | 0 | 0 | 0 |
| 170 | 45 | Streptomyces phage Φ0 | Streptomyces phage Φ0 is a bacteriophage that infects Streptomyces. It was discovered in 2016. The bacteriophage contains a double-stranded RNA genome and probably belongs to the Cystoviridae family.\n\n\n== References == | Bacteriophage | en | streptomyces phage φ0 bacteriophage infects streptomyces discovered 2016 bacteriophage contains doublestranded rna genome probably belongs cystoviridae family reference | 30 | 189 | 4 | 6.300000 | 7.500000 | 3 | 19 | 6.333333 | 0.000000 | (streptomyces, phage, φ0, bacteriophage, infects, streptomyces, discovered, 2016, bacteriophage, contains, doublestranded, rna, genome, probably, belongs, cystoviridae, family, reference) | [(Streptomyces, ORG), (Streptomyces, FAC), (2016, DATE), (Cystoviridae, PERSON)] | [0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0] | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 171 | 46 | T4 rII system | The T4 rII system is an experimental system developed in the 1950s by Seymour Benzer for studying the substructure of the gene. The experimental system is based on genetic crosses of different mutant strains of bacteriophage T4, a virus that infects the bacteria E. coli. | Bacteriophage | en | t4 rii system experimental system developed 1950s seymour benzer studying substructure gene experimental system based genetic cross different mutant strain bacteriophage t4 virus infects bacteria e coli | 46 | 227 | 4 | 4.934783 | 11.500000 | 3 | 11 | 3.666667 | 0.075000 | (t4, rii, system, experimental, system, developed, 1950s, seymour, benzer, studying, substructure, gene, experimental, system, based, genetic, cross, different, mutant, strain, bacteriophage, t4, virus, infects, bacteria, e, coli) | [(T4, ORG), (the 1950s, DATE), (Seymour Benzer, PERSON), (T4, ORG)] | [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0] | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 1 | 0 | 0 | 0 | 0 |
| 172 | 47 | Tectivirus | Tectiviridae is a family of viruses with 10 species in five genera. Bacteria serve as natural hosts. Tectiviruses have no head-tail structure, but are capable of producing tail-like tubes of ~ 60×10 nm upon adsorption or after chloroform treatment. The name is derived from Latin tectus (meaning 'covered'). | Bacteriophage | en | tectiviridae family virus 10 specie five genus bacteria serve natural host tectiviruses headtail structure capable producing taillike tube 6010 nm upon adsorption chloroform treatment name derived latin tectus meaning covered | 48 | 260 | 5 | 5.416667 | 9.600000 | 1 | 10 | 10.000000 | 0.150000 | (tectiviridae, family, virus, 10, specie, five, genus, bacteria, serve, natural, host, tectiviruses, headtail, structure, capable, producing, taillike, tube, 6010, nm, upon, adsorption, chloroform, treatment, name, derived, latin, tectus, meaning, covered) | [(Tectiviridae, GPE), (10, CARDINAL), (five, CARDINAL), (Latin, NORP)] | [2, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0] | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 173 | 48 | Temperateness (virology) | In virology, temperate refers to the ability of some bacteriophages (notably coliphage λ) to display a lysogenic life cycle. Many (but not all) temperate phages can integrate their genomes into their host bacterium's chromosome, together becoming a lysogen as the phage genome becomes a prophage. A temperate phage is also able to undergo a productive, typically lytic life cycle, where the prophage is expressed, replicates the phage genome, and produces phage progeny, which then leave the bacterium. With phage the term virulent is often used as an antonym to temperate, but more strictly a virulent phage is one that has lost its ability to display lysogeny through mutation rather than a phage lineage with no genetic potential to ever display lysogeny (which more properly would be described as an obligately lytic phage). | Bacteriophage | en | virology temperate refers ability bacteriophage notably coliphage λ display lysogenic life cycle many temperate phage integrate genome host bacterium chromosome together becoming lysogen phage genome becomes prophage temperate phage also able undergo productive typically lytic life cycle prophage expressed replicates phage genome produce phage progeny leave bacterium phage term virulent often used antonym temperate strictly virulent phage one lost ability display lysogeny mutation rather phage lineage genetic potential ever display lysogeny properly would described obligately lytic phage | 133 | 697 | 5 | 5.240602 | 26.600000 | 2 | 23 | 11.500000 | 0.309259 | (virology, temperate, refers, ability, bacteriophage, notably, coliphage, λ, display, lysogenic, life, cycle, many, temperate, phage, integrate, genome, host, bacterium, chromosome, together, becoming, lysogen, phage, genome, becomes, prophage, temperate, phage, also, able, undergo, productive, typically, lytic, life, cycle, prophage, expressed, replicates, phage, genome, produce, phage, progeny, leave, bacterium, phage, term, virulent, often, used, antonym, temperate, strictly, virulent, phage, one, lost, ability, display, lysogeny, mutation, rather, phage, lineage, genetic, potential, ever, display, lysogeny, properly, would, described, obligately, lytic, phage) | [] | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 174 | 49 | Transduction (genetics) | Transduction is the process by which foreign DNA is introduced into a cell by a virus or viral vector. An example is the viral transfer of DNA from one bacterium to another and hence an example of horizontal gene transfer. Transduction does not require physical contact between the cell donating the DNA and the cell receiving the DNA (which occurs in conjugation), and it is DNase resistant (transformation is susceptible to DNase). Transduction is a common tool used by molecular biologists to stably introduce a foreign gene into a host cell's genome (both bacterial and mammalian cells). | Bacteriophage | en | transduction process foreign dna introduced cell virus viral vector example viral transfer dna one bacterium another hence example horizontal gene transfer transduction require physical contact cell donating dna cell receiving dna occurs conjugation dnase resistant transformation susceptible dnase transduction common tool used molecular biologist stably introduce foreign gene host cell genome bacterial mammalian cell | 97 | 495 | 5 | 5.103093 | 19.400000 | 2 | 22 | 11.000000 | -0.137500 | (transduction, process, foreign, dna, introduced, cell, virus, viral, vector, example, viral, transfer, dna, one, bacterium, another, hence, example, horizontal, gene, transfer, transduction, require, physical, contact, cell, donating, dna, cell, receiving, dna, occurs, conjugation, dnase, resistant, transformation, susceptible, dnase, transduction, common, tool, used, molecular, biologist, stably, introduce, foreign, gene, host, cell, genome, bacterial, mammalian, cell) | [] | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 175 | 50 | Viral plaque | A viral plaque is a visible structure formed after introducing a viral sample to a cell culture grown on some nutrient medium. The virus will replicate and spread, generating regions of cell destruction known as plaques. For example, Vero cell or other tissue cultures may be used to investigate an influenza virus or coronavirus, while various bacterial cultures would be used for bacteriophages.\nCounting the number of plaques can be used as a method of virus quantification. These plaques can sometimes be detected visually using colony counters, in much the same way as bacterial colonies are counted; however, they are not always visible to the naked eye, and sometimes can only be seen through a microscope, or using techniques such as staining (e.g. neutral red for eukaryotes or giemsa for bacteria) or immunofluorescence. Special computer systems have been designed with the ability to scan samples in batches.\n\nThe appearance of the plaque depends on the host strain, virus and the conditions. Highly virulent or lytic strains create plaques that look clear (due to total cell destruction), while strains that only kill a fraction of their hosts (due to partial resistance/lysogeny), or only reduce the rate of cell growth, give turbid plaques. Some partially lysogenic phages give bull's-eye plaques with spots or rings of growth in the middle of clear regions of complete lysis. | Bacteriophage | en | viral plaque visible structure formed introducing viral sample cell culture grown nutrient medium virus replicate spread generating region cell destruction known plaque example vero cell tissue culture may used investigate influenza virus coronavirus various bacterial culture would used bacteriophage counting number plaque used method virus quantification plaque sometimes detected visually using colony counter much way bacterial colony counted however always visible naked eye sometimes seen microscope using technique staining eg neutral red eukaryote giemsa bacteria immunofluorescence special computer system designed ability scan sample batch appearance plaque depends host strain virus condition highly virulent lytic strain create plaque look clear due total cell destruction strain kill fraction host due partial resistancelysogeny reduce rate cell growth give turbid plaque partially lysogenic phage give bullseye plaque spot ring growth middle clear region complete lysis | 223 | 1170 | 12 | 5.246637 | 18.583333 | 2 | 11 | 5.500000 | 0.020097 | (viral, plaque, visible, structure, formed, introducing, viral, sample, cell, culture, grown, nutrient, medium, virus, replicate, spread, generating, region, cell, destruction, known, plaque, example, vero, cell, tissue, culture, may, used, investigate, influenza, virus, coronavirus, various, bacterial, culture, would, used, bacteriophage, counting, number, plaque, used, method, virus, quantification, plaque, sometimes, detected, visually, using, colony, counter, much, way, bacterial, colony, counted, however, always, visible, naked, eye, sometimes, seen, microscope, using, technique, staining, eg, neutral, red, eukaryote, giemsa, bacteria, immunofluorescence, special, computer, system, designed, ability, scan, sample, batch, appearance, plaque, depends, host, strain, virus, condition, highly, virulent, lytic, strain, create, plaque, look, clear, due, ...) | [(Vero, ORG)] | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0] | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 176 | 51 | Viral shunt | The viral shunt is a mechanism that prevents marine microbial particulate organic matter (POM) from migrating up trophic levels by recycling them into dissolved organic matter (DOM), which can be readily taken up by microorganisms. The DOM recycled by the viral shunt pathway is comparable to the amount generated by the other main sources of marine DOM.Viruses can easily infect microorganisms in the microbial loop due to their relative abundance compared to microbes. Prokaryotic and eukaryotic mortality contribute to carbon nutrient recycling through cell lysis. There is evidence as well of nitrogen (specifically ammonium) regeneration. This nutrient recycling helps stimulates microbial growth. As much as 25% of the primary production from phytoplankton in the global oceans may be recycled within the microbial loop through the viral shunt. | Bacteriophage | en | viral shunt mechanism prevents marine microbial particulate organic matter pom migrating trophic level recycling dissolved organic matter dom readily taken microorganism dom recycled viral shunt pathway comparable amount generated main source marine domviruses easily infect microorganism microbial loop due relative abundance compared microbe prokaryotic eukaryotic mortality contribute carbon nutrient recycling cell lysis evidence well nitrogen specifically ammonium regeneration nutrient recycling help stimulates microbial growth much 25 primary production phytoplankton global ocean may recycled within microbial loop viral shunt | 129 | 724 | 8 | 5.612403 | 16.125000 | 2 | 10 | 5.000000 | 0.127778 | (viral, shunt, mechanism, prevents, marine, microbial, particulate, organic, matter, pom, migrating, trophic, level, recycling, dissolved, organic, matter, dom, readily, taken, microorganism, dom, recycled, viral, shunt, pathway, comparable, amount, generated, main, source, marine, domviruses, easily, infect, microorganism, microbial, loop, due, relative, abundance, compared, microbe, prokaryotic, eukaryotic, mortality, contribute, carbon, nutrient, recycling, cell, lysis, evidence, well, nitrogen, specifically, ammonium, regeneration, nutrient, recycling, help, stimulates, microbial, growth, much, 25, primary, production, phytoplankton, global, ocean, may, recycled, within, microbial, loop, viral, shunt) | [(DOM, ORG), (DOM, ORG), (As much as 25%, PERCENT)] | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 0, 0, 0, 0] | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 |
| 177 | 52 | Auxiliary metabolic genes | Auxiliary metabolic genes (AMGs) are found in many bacteriophages but originated in bacterial cells. AMGs modulate host cell metabolism during infection so that the phage can replicate more efficiently. For instance, bacteriophages that infect the abundant marine cyanobacteria Synechococcus and Prochlorococcus (cyanophages) carry AMGs that have been acquired from their immediate host as well as more distantly-related bacteria. Cyanophage AMGs support a variety of functions including photosynthesis, carbon metabolism, nucleic acid synthesis and metabolism.\n\n\n== References == | Bacteriophage | en | auxiliary metabolic gene amgs found many bacteriophage originated bacterial cell amgs modulate host cell metabolism infection phage replicate efficiently instance bacteriophage infect abundant marine cyanobacteria synechococcus prochlorococcus cyanophages carry amgs acquired immediate host well distantlyrelated bacteria cyanophage amgs support variety function including photosynthesis carbon metabolism nucleic acid synthesis metabolism reference | 76 | 505 | 5 | 6.644737 | 15.200000 | 3 | 23 | 7.666667 | 0.525000 | (auxiliary, metabolic, gene, amgs, found, many, bacteriophage, originated, bacterial, cell, amgs, modulate, host, cell, metabolism, infection, phage, replicate, efficiently, instance, bacteriophage, infect, abundant, marine, cyanobacteria, synechococcus, prochlorococcus, cyanophages, carry, amgs, acquired, immediate, host, well, distantlyrelated, bacteria, cyanophage, amgs, support, variety, function, including, photosynthesis, carbon, metabolism, nucleic, acid, synthesis, metabolism, reference) | [(Synechococcus, GPE), (Prochlorococcus, ORG)] | [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0] | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 178 | 53 | WO virus | WO virus is bacteriophage virus that infects bacteria of the genus Wolbachia, which it is named after. This virus is notable for carrying DNA related to the black widow spider toxin gene, becoming an example of a bacteriophage with animal-like DNA, implying DNA transfers between eukaryotes and bacteriophages.\n\n\n== References == | Bacteriophage | en | wo virus bacteriophage virus infects bacteria genus wolbachia named virus notable carrying dna related black widow spider toxin gene becoming example bacteriophage animallike dna implying dna transfer eukaryote bacteriophage reference | 50 | 280 | 3 | 5.600000 | 16.666667 | 2 | 7 | 3.500000 | 0.195833 | (wo, virus, bacteriophage, virus, infects, bacteria, genus, wolbachia, named, virus, notable, carrying, dna, related, black, widow, spider, toxin, gene, becoming, example, bacteriophage, animallike, dna, implying, dna, transfer, eukaryote, bacteriophage, reference) | [(Wolbachia, PERSON)] | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0] | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |